dNTPpoolDB comprises quantitative data on dNTP levels measured in biological samples. We aimed to include published measurements of all nucleotides that can be incorporated into DNA. dNTPpoolDB is manually curated: each entry is created by a competent annotator. dNTPpoolDB was designed to incorporate all relevant information on dNTP measurements from the source articles, including unambiguous specification of the source organism, tissue/cell type, subcellular compartment, the applied extraction and measurement methods, measurement parameters and units, and potential treatments. A significant added value of dNTPpoolDB is that it extracts and organizes pieces of information that are often only vaguely mentioned in the original articles. Classifying and associating information from widely different methods and forms of data presentation makes a large number of otherwise separate measurements comparable.
Entries and pools
Entries can be reached from the Browser page. Each entry of the database corresponds to one measurement of an individual dNTP or, rarely, a non-separated combination of dNTPs measured under a set of conditions. The entry contains all information relevant to the dNTP measurement extracted from the source article. The measured dNTP quantity was either given numerically in the source article or extracted by us from figure diagrams. Measurements are grouped into pools whenever the levels of at least two different dNTPs have been measured in the same study under the same conditions. Where available, pool information is included in the entry pages in the form of bar charts. Hovering the mouse pointer over a bar displays the exact quantity it represents. Inter-linked pooled entries can be reached from the entry page of any member of the pool. To ensure high-level integration and consistency of our data with the existing biological data resource infrastructure, the details of each dNTPpoolDB entry are cross-referenced to the relevant databases, including Europe PMC and PubMed for the identification of the source publication, NCBI Taxonomy for the identification of species, Cell Line Ontology (CLO) for the identification of cell lines, PubChem for the identification of small molecules, and UniProt for unequivocal protein identification.
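The pooling rule described above can be illustrated with a minimal Python sketch. The record fields (`id`, `study`, `conditions`, `dntp`, `value`) and the sample values are hypothetical and chosen only for illustration; they are not the actual dNTPpoolDB schema or real measurements.

```python
from collections import defaultdict

# Hypothetical entry records with made-up values; the field names are
# illustrative assumptions, not the real dNTPpoolDB schema.
entries = [
    {"id": "E1", "study": "study_A", "conditions": "untreated", "dntp": "dATP", "value": 24.0},
    {"id": "E2", "study": "study_A", "conditions": "untreated", "dntp": "dTTP", "value": 37.0},
    {"id": "E3", "study": "study_A", "conditions": "treated", "dntp": "dATP", "value": 9.0},
]


def group_into_pools(entries):
    """Group entries that share a study and a set of conditions.

    A group counts as a pool only if it covers at least two different dNTPs.
    """
    by_key = defaultdict(list)
    for entry in entries:
        by_key[(entry["study"], entry["conditions"])].append(entry)

    pools = []
    for members in by_key.values():
        distinct_dntps = {m["dntp"] for m in members}
        if len(distinct_dntps) >= 2:
            pools.append(members)
    return pools


pools = group_into_pools(entries)
for pool in pools:
    print([m["id"] for m in pool])
```

In this sketch, E1 and E2 form a pool, while E3 stays a stand-alone entry because no second dNTP was measured under the same conditions.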
We paid particular attention to classifying the dimensions of the measured dNTP quantities appropriately, as this is crucial for the interpretation and comparability of the measurements. We refrained from converting dimensions other than by simple decimal conversion. In many publications, dNTP quantitation data are only given in relative terms, normalized to a control (e.g. wild-type sample, all dNTPs or NTPs). These data were also included in the database (cf. Statistics page “dNTP quantity term”), since dNTP ratios within a pool, as well as dNTP pool alterations, are informative. In these cases, we unambiguously specified the relation between the measured values.
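As an illustration of what "simple decimal conversion" means here, the following Python sketch rescales a value between SI prefixes but refuses to convert between different dimensions. The unit representation as a (prefix, dimension) pair and the function name `decimal_convert` are assumptions made for this example, not part of dNTPpoolDB.

```python
# SI prefixes expressed as powers of ten ("u" stands in for micro).
PREFIX_EXPONENT = {"": 0, "m": -3, "u": -6, "n": -9, "p": -12, "f": -15}


def decimal_convert(value, from_unit, to_unit):
    """Rescale a value between units that share the same dimension.

    A unit is a hypothetical (prefix, dimension) pair, for example
    ("p", "mol/10^6 cells"). Only the prefix may differ; a change of
    dimension (e.g. amount per cell to concentration) is rejected.
    """
    from_prefix, from_dim = from_unit
    to_prefix, to_dim = to_unit
    if from_dim != to_dim:
        raise ValueError(f"cannot convert {from_dim!r} to {to_dim!r}")
    shift = PREFIX_EXPONENT[from_prefix] - PREFIX_EXPONENT[to_prefix]
    if shift >= 0:
        return value * 10 ** shift
    return value / 10 ** (-shift)


# 1500 pmol per 10^6 cells expressed in nmol per 10^6 cells.
in_nmol = decimal_convert(1500, ("p", "mol/10^6 cells"), ("n", "mol/10^6 cells"))
print(in_nmol)
```

Converting between dimensions, such as from an amount per cell number to a molar concentration, would need extra assumptions (for instance a cell volume), which is why the sketch raises an error in that case.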
Most of the data are presented in the literature in a graphical format (cf. Statistics page "Data presentation in publications"). The information content of these diagrams was extracted using the WebPlotDigitizer tool. By converting several thousand graphical dNTP quantitation points into numerical format, we made these valuable data accessible for subsequent bioinformatics analysis. Conversion from graphical to numerical representation is indicated in the entry pages. Whenever error bars were shown in the diagrams, the error value was also extracted and included in the database. For SD, SE and SEM terms, the error is given as a single number. When an error range, interquartile range or confidence interval was given, we specified its lower and upper limits. Whenever the biological sample was subjected to treatment or genetic modification, this is specified, including the chemical agent used and the gene/protein involved, with cross-references to the relevant databases.
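The two ways of recording errors described above can be sketched as a small Python data structure. The class `ErrorValue`, its field names and the short labels "IQR" and "CI" are hypothetical and serve only to illustrate the distinction between single-number errors and lower/upper limits.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Error terms stored as a single number, as described in the text.
SYMMETRIC_KINDS = {"SD", "SE", "SEM"}


@dataclass
class ErrorValue:
    """Hypothetical container for the error attached to a measurement."""

    kind: str                          # e.g. "SD", "SEM", "range", "IQR", "CI"
    magnitude: Optional[float] = None  # used for SD, SE and SEM
    lower: Optional[float] = None      # used for ranges and intervals
    upper: Optional[float] = None

    def __post_init__(self):
        symmetric = self.kind in SYMMETRIC_KINDS
        if symmetric and self.magnitude is None:
            raise ValueError(f"{self.kind} needs a single magnitude")
        if not symmetric and (self.lower is None or self.upper is None):
            raise ValueError(f"{self.kind} needs lower and upper limits")

    def bounds(self, value: float) -> Tuple[float, float]:
        """Return the extent of the error bar around a measured value."""
        if self.kind in SYMMETRIC_KINDS:
            return (value - self.magnitude, value + self.magnitude)
        return (self.lower, self.upper)


sd_error = ErrorValue("SD", magnitude=2.0)
ci_error = ErrorValue("CI", lower=7.5, upper=13.0)
print(sd_error.bounds(10.0), ci_error.bounds(10.0))
```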
The Search function is available in three formats.
- On the Browser page, you can choose from pre-defined controlled vocabularies in the header of each column to select for a subset of data. Here, you can quickly reorganize your selection by main categories. This option does not offer multiple choices within categories.
- On the Browser and Advanced search pages, a free-text search is included to enable searching in all categories, including those not defined by controlled vocabularies, e.g. experiment details, remarks and growth conditions.
- The Advanced search page is dedicated to selecting entries using a large set of criteria, allowing multiple choices within categories.
Pairwise comparison
The pairwise comparison function is useful for quickly assessing differences between two entries and for detecting dNTP pool changes as a function of a desired parameter. This function can be reached 1) through the Browser page, 2) through the Advanced search page, 3) from the Entry page and 4) directly on the dedicated Pairwise comparison page. On the Browser and Advanced search pages, you can select two entries to be compared using the selection box on the left. Then, after you hit the Compare button on the upper right, the Pairwise comparison page shows up with the result. Although individual entries are selected, the entire pools are compared if applicable. If the dimensions of the measured dNTP values match, the comparison is done in a single bar chart. We alert users whenever datasets are not directly comparable. Below the graphical comparison, you will find all information available in the Entry pages in a transparent comparison layout. Alternatively, you can enter two entry or pool identifiers into the relevant query boxes on the Pairwise comparison page and hit the Compare button.
Whenever we could identify control and treatment measurements in the source publication, we aimed to pair these data. If control-treatment pairs are available, the Compare button is offered directly below the related pool in the Entry pages, so that the paired pools can be compared graphically with a single click.
Data download
The entire content of the database can be obtained under the Downloads option, while a subset of it selected using search criteria or a single entry can be downloaded on the Browser/Search pages and Entry pages, respectively. We offer three different formats: tsv, xml and json.
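Downloaded files in tsv or json format can be read with standard tools. The Python sketch below parses a small inline tsv sample using only the standard library; the column names (`entry_id`, `dntp`, `value`, `unit`) and the values are invented for illustration and may not match the columns of the actual download files.

```python
import csv
import io
import json

# Made-up sample in tsv layout; the real download may use different columns.
sample_tsv = (
    "entry_id\tdntp\tvalue\tunit\n"
    "E1\tdATP\t24\tpmol/10^6 cells\n"
    "E2\tdTTP\t37\tpmol/10^6 cells\n"
)


def read_tsv(text):
    """Parse tab-separated text into a list of dictionaries keyed by column."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


rows = read_tsv(sample_tsv)

# csv returns every field as a string, so numeric columns are converted here.
values = {row["dntp"]: float(row["value"]) for row in rows}

# The same records can be serialized to json and read back unchanged.
as_json = json.dumps(rows)
restored = json.loads(as_json)
print(values)
```

For a real download, `io.StringIO(text)` would be replaced by an open file handle on the downloaded file.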
Statistics
The pie charts of the Statistics page are customizable. For instance, clicking on the Bacteria category below the "Taxonomic group" pie chart will exclude Bacteria and show the distribution of the remaining dNTP data among eukaryotic taxonomic groups only. Hovering the mouse pointer over a fraction of the chart shows the exact number of entries it represents.
We are constantly adding new data to dNTPpoolDB. Suggestions for adding your own or other published data are welcome!
Rita Pancsa, Erzsébet Fichó, Dániel Molnár, Éva Viola Surányi, Tamás Trombitás, Dóra Füzesi, Hanna Lóczi, Péter Szijjártó, Rita Hirmondó, Judit E Szabó, Judit Tóth; dNTPpoolDB: a manually curated database of experimentally determined dNTP pools and pool changes in biological samples, Nucleic Acids Research, 2021. gkab910