Multi-elemental and multi-dimensional data are increasingly important to data-driven research. Modern paleontology is a case in point: fossil specimens are examined by experts and may one day be examined by artificial intelligence. Constructing integrated datasets of fossil specimens is essential to basic research in paleontology and stratigraphy, aids shale gas exploration, and promotes the application of artificial intelligence in both fields.
Recently, Prof. XU Honghe of the Big Data Center of the Nanjing Institute of Geology and Palaeontology, Chinese Academy of Sciences (NIGPAS), led a paleontological informatics study based on fossil specimen data. The results were published in Earth System Science Data, a leading international, interdisciplinary, open-access journal in Earth system sciences.
“It took us over two years to finish the construction of this multi-dimensional dataset of graptolite specimens,” says XU. “Our process includes specimen selection, curation and revision, summarizing scientific information, photography, data cleaning, and cloud storage and backup.”
This dataset includes 1550 graptolite specimens from the Ordovician-Silurian strata of South China, which are significant in global stratigraphic correlation and shale gas exploration, and it covers 113 graptolite species or subspecies in systematic classification. The dataset contains 2951 high-resolution images and a data table of each specimen’s scientific information, including taxonomic, geologic, and geographic information, comments, and references.
“Our dataset provides images for specialists and general users worldwide and is supported by FSIDvis (Fossil Specimen Image Dataset Visualizer), a tool we developed to facilitate interactive exploration of the attribute-rich image dataset. It includes a nonlinear dimensionality reduction technique, t-SNE (t-distributed stochastic neighbor embedding), which projects image data into a two-dimensional space to visualize and explore similarities,” XU says. Individual specimens are denoted by different colors and grouped in the visualization. These groups also match different graptolite families taxonomically. The dataset potentially contributes to virtual examination of specimens, global biostratigraphic correlation, and more efficient shale gas exploration.
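For readers unfamiliar with t-SNE, the projection step described above can be illustrated with a minimal sketch. This is not the FSIDvis implementation. It assumes scikit-learn's `TSNE` as a stand-in, uses synthetic 64-dimensional vectors in place of real image features (the article does not say how features are extracted from the specimen images), and the function name `embed_images` and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.manifold import TSNE


def embed_images(features: np.ndarray, perplexity: float = 5.0, seed: int = 0) -> np.ndarray:
    """Project an (n_samples, n_features) array to two dimensions with t-SNE.

    `features` stands in for per-image feature vectors; how FSIDvis derives
    them is not described in the article, so this input is an assumption.
    """
    tsne = TSNE(
        n_components=2,         # two-dimensional output, as in the article
        perplexity=perplexity,  # must be smaller than the number of samples
        init="pca",
        random_state=seed,      # fixed seed for a repeatable layout
    )
    return tsne.fit_transform(features)


# Synthetic stand-in data: three well-separated groups of 20 vectors each,
# playing the role of three taxonomic groups (purely illustrative).
rng = np.random.default_rng(0)
centers = rng.normal(scale=10.0, size=(3, 64))
features = np.vstack([c + rng.normal(size=(20, 64)) for c in centers])
labels = np.repeat(np.arange(3), 20)

embedding = embed_images(features)

# Report where each synthetic group lands in the 2-D layout.
for group in range(3):
    centroid = embedding[labels == group].mean(axis=0)
    print(f"group {group}: centroid at {centroid}")
```

In a visualization, each row of `embedding` would be plotted as a point and colored by its label, so that similar images appear close together.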
This work is a contribution to the Deep-time Digital Earth (DDE) Big Science Program.
Reference: Xu, H.-H., Niu, Z.-B., Chen, Y.-S., Ma, X., Tong, X.-J., Sun, Y.-T., Dong, X.-Y., Fan, D.-N., Song, S.-S., Zhu, Y.-Y., Yang, N., and Xia, Q. 2023. A multi-dimensional dataset of Ordovician to Silurian graptolite specimens for virtual examination, global correlation, and shale gas exploration. Earth Syst. Sci. Data. 15, 2213–2221, https://doi.org/10.5194/essd-15-2213-2023.
Fig 1. The process of creating the graptolite specimen image dataset.
Fig 2. Geographic distribution (a) and geologic range (b) of graptolite species of our dataset.
Fig 3. t-SNE visualization of our graptolite specimen images.
Contact:
LIU Yun, Propagandist
Email: yunliu@nigpas.ac.cn
Nanjing Institute of Geology and Palaeontology, Chinese Academy of Sciences
Nanjing, Jiangsu 210008, China