VizioMetrics Open Data
Datasets
- PubMed Central
Current release
- Image data with labels
- Paper metadata with Eigenfactor scores
Coming soon
- Ground truth image data from PubMed Central
Cite our paper
@inproceedings{lee2016viziometrix,
author = {Lee, Poshen and West, Jevin and Howe, Bill},
title = {VizioMetrix: A Platform for Analyzing the Visual Information in Big Scholarly Data},
booktitle = {BigScholar Workshop (co-located at WWW)},
year = {2016}
}
Image Data
Our figure images are stored on Amazon S3. Each image is accessed through its own S3 key (see the AWS API). The VizioMetrics API provides several ways to obtain these keys. To download the image files:
- Get the S3 keys via the VizioMetrics REST API
- Use the S3 keys to access the files
We currently provide three methods to collect the S3 keys:
- GET keys by PMID
- GET keys by PMCID
- GET keys by full-text search
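The two steps above can be sketched in Python as follows. This is a minimal illustration, not an official client: the function names are hypothetical, the base URLs are copied from the request URLs documented on this page, and it assumes the API returns a JSON list of records that each carry an img_loc field.

```python
import json
from urllib.request import urlopen

# Base URLs copied from the request URLs documented on this page.
API_BASE = "http://viziometrics.org/api/pmc/image"
S3_BASE = "http://s3-us-west-2.amazonaws.com/escience.washington.edu.viziometrics"


def keys_url_for_pmcid(pmcid):
    """Step 1: build the request URL that lists the figures of one paper."""
    return f"{API_BASE}/pmcid/{pmcid}/"


def image_urls(records):
    """Step 2: turn API records into image URLs using their img_loc S3 keys."""
    return [f"{S3_BASE}/{rec['img_loc']}" for rec in records]


def fetch_image_urls(pmcid, timeout=30):
    """Run both steps for one PMCID (hypothetical helper; needs network access)."""
    with urlopen(keys_url_for_pmcid(pmcid), timeout=timeout) as resp:
        records = json.load(resp)
    return image_urls(records)
```

Calling fetch_image_urls(2872101) would, under these assumptions, return one URL per figure of that paper.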
Get Image Keys
Use PMID or PMCID
Request URL
GET http://viziometrics.org/api/pmc/image/pmcid/{pmcid}/
or
GET http://viziometrics.org/api/pmc/image/pmid/{pmid}/
Return Schema
{
pmid (integer),
pmcid (integer),
doi (string),
longname (string), #Journal Name
title (string),
year_pub (string),
img_loc (string), #S3 key
class_name (string), #Figure Type
is_composite (integer), #Value = 1: Composite Figure; Value = 0: Singleton Figure
caption (string),
is_repret (integer) #Value = 1: Premier Figure of the Paper
}
Example Code
GET http://viziometrics.org/api/pmc/image/pmcid/2872101/
Response Body
[
{
"pmid": 20463734,
"pmcid": 2872101,
"doi": "10.1038/nature09026",
"longname": "Nature",
"title": "A Proximity-Based Programmable DNA Nanoscale Assembly Line",
"year_pub": "2010",
"img_loc": "pubmed/img/PMC2872101_nihms182085f1.jpg",
"class_name": "diagram",
"is_composite": 0,
"caption": "The molecular assembly line and its operation(a) The basic components of the system are the origami tile (shown as a tan outline), programmable 2-state DNA machines inserted in series into the file (shown in blue, purple and green), and the walker (shown as a trigonal arrangement of DNA double helices in red). The cargo of the machines consists of a 5 nm gold particle, a coupled pair of 5 nm particles or a 10 nm particle (indicated by brown dots), with their states labeled as PX (meaning ON or donate cargo) and JX2 (meaning OFF or do not donate cargo). In the example shown, the walker collects cargo from each machine. (b) Atomic force micrographs of the system corresponding to the process steps sketched in the right panels (a). AFM was performed by tapping in air; this mode of AFM results in only the nanoparticles and the origami being visible, and the individual nanoparticle components are not resolved from each other. Owing to the washing procedures between steps, the AFM images are not of the same individual assembly line. All scale bars are 50 nm.\r",
"is_repret": 0
},
{
"pmid": 20463734,
"pmcid": 2872101,
"doi": "10.1038/nature09026",
"longname": "Nature",
"title": "A Proximity-Based Programmable DNA Nanoscale Assembly Line",
"year_pub": "2010",
"img_loc": "pubmed/img/PMC2872101_nihms182085f2.jpg",
"class_name": "diagram",
"is_composite": 0,
"caption": "Details of the Walker, Movement, and Cargo Transfer(a) Walker structure: The drawing at left is a stick figure indicating the three hands (H1-H3) and four feet (F1-F4). The image at right shows the strand structure. (b) Movement: Walker reactions are in panels i and ii, and movement on the origami is in panels iii and iv. Figure S5 shows the complete walker transit. (c) Cargo transfer: (i) The PX state brings the arm of cassette 1 close to hand H1. (ii) The brown toehold binds its complement (red). (iii) Branch migration transfers the cargo strand to hand H1.\r",
"is_repret": 0
},
{
"pmid": 20463734,
"pmcid": 2872101,
"doi": "10.1038/nature09026",
"longname": "Nature",
"title": "A Proximity-Based Programmable DNA Nanoscale Assembly Line",
"year_pub": "2010",
"img_loc": "pubmed/img/PMC2872101_nihms182085f3.jpg",
"class_name": "composite",
"is_composite": 1,
"caption": "The Eight Products of the Assembly LineThe small Roman numerals indicate the different pathways illustrated in panels (a), (b) and (c). (a) The eight possible products that can be generated through appropriate programming of the state of the three DNA machines. The walker is shown at the left, without cargo. Each DNA machine is shown twice: in the upper row in the OFF state where no cargo transfer takes place, and in the lower row in the ON state where cargo can be transferred to the walker. The different assembly trajectories are color coded as black, dark blue, rose, brown, yellow, light blue, green, and magenta, giving products i-viii, respectively, shown schematically at right. (b) Schematic of the final state the system reaches for each of the eight assembly pathways. The states of the cassettes and the dispositions of the cargo species (attached to the robot arms or attached to the walker) are visible. (c) TEM images of the products generated in each of the assembly pathways. (Note that TEM resolves the individual gold Nanoparticles.) In each image, several products generated by the given pathway are visible. All scale bars are 50 nm.\r",
"is_repret": 1
}
]
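Once the response is parsed, the flag fields can be used to select figures. The sketch below works on abridged copies of the three records above (captions and bibliographic fields omitted); the helper names are hypothetical.

```python
from collections import Counter

# Abridged copies of the three records in the response above.
records = [
    {"img_loc": "pubmed/img/PMC2872101_nihms182085f1.jpg",
     "class_name": "diagram", "is_composite": 0, "is_repret": 0},
    {"img_loc": "pubmed/img/PMC2872101_nihms182085f2.jpg",
     "class_name": "diagram", "is_composite": 0, "is_repret": 0},
    {"img_loc": "pubmed/img/PMC2872101_nihms182085f3.jpg",
     "class_name": "composite", "is_composite": 1, "is_repret": 1},
]


def representative_figures(recs):
    """Keep only the records flagged as the premier figure of their paper."""
    return [r for r in recs if r.get("is_repret") == 1]


def class_counts(recs):
    """Count how many figures fall into each figure type."""
    return Counter(r["class_name"] for r in recs)


print([r["img_loc"] for r in representative_figures(records)])
print(class_counts(records))
```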
Search keywords in abstract, title, and caption
Request URL
GET http://viziometrics.org/api/pmc/image/search/?keywords={keywords}&number={number}&qrandom={qrandom}
Parameters
- keywords (varchar, required): the keywords to look for in paper abstracts, titles, and captions
- number (integer, optional): the number of images to return (default = 20)
- qrandom (boolean, optional): return results in random order (default = false, which orders results by impact score, descending)
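A query string with these parameters can be assembled with the standard library so that spaces and special characters in the keywords are percent-encoded. This is a sketch under assumptions: the endpoint path is copied from the example request in this section, the function name is hypothetical, and sending the boolean as the lowercase strings true/false mirrors that example rather than a documented rule.

```python
from urllib.parse import quote, urlencode

# Endpoint path copied from the example request in this section (assumption).
SEARCH_ENDPOINT = "http://viziometrics.org/api/pmc/image/search/"


def search_url(keywords, number=None, qrandom=None):
    """Build a search URL; optional parameters are omitted when not given."""
    params = {"keywords": keywords}
    if number is not None:
        params["number"] = int(number)
    if qrandom is not None:
        params["qrandom"] = "true" if qrandom else "false"
    # quote_via=quote encodes spaces as %20 instead of "+".
    return f"{SEARCH_ENDPOINT}?{urlencode(params, quote_via=quote)}"


print(search_url("dna origami", number=3, qrandom=True))
```

Leaving number and qrandom unset lets the server apply its own defaults (20 results, ordered by impact score).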
Return Schema
{
pmid (integer),
pmcid (integer),
doi (string),
longname (string), #Journal Name
title (string),
year_pub (string),
img_loc (string), #S3 key
class_name (string), #Figure Type
is_composite (integer), #Value = 1: Composite Figure; Value = 0: Singleton Figure
caption (string),
is_repret (integer) #Value = 1: Premier Figure of the Paper
}
Example Code
GET http://viziometrics.org/api/pmc/image/search/?keywords=dna%20origami&qrandom=true&number=3
Response Body
[
{
"class_name": "diagram",
"pmcid": 2836238,
"is_composite": false,
"doi": "10.1038/nnano.2009.5",
"is_repret": 0.0,
"title": "Dynamic Patterning Programmed by DNA Tiles Captured on a DNA Origami Substrate",
"caption": "",
"year_pub": "2009",
"img_loc": "pubmed/img/PMC2836238_NIHMS86777-supplement-5.jpg",
"longname": "Nature nanotechnology",
"pmid": 19350035
},
{
"class_name": "diagram",
"pmcid": 3496572,
"is_composite": false,
"doi": "10.1186/1471-2105-13-138",
"is_repret": 0.0,
"title": "EGNAS: an exhaustive DNA sequence design algorithm",
"caption": "Influence of the GC content and GC ends on the set size. Dependence of the set sizes on the GC content. Sets of 10 bases long sequences with Lc = 4 for global criton rules were generated. Averages standard deviations were calculated from 1,000 sets for restricted and from 10,000 sets for unrestricted conditions [0; 100]. The maximum set size N(max) = 17 is shown.\r",
"year_pub": "2012",
"img_loc": "pubmed/img/PMC3496572_1471-2105-13-138-3.jpg",
"longname": "BMC Bioinformatics",
"pmid": 22716030
},
{
"class_name": "composite",
"pmcid": 3794576,
"is_composite": true,
"doi": "10.1093/nar/gkt592",
"is_repret": 0.0,
"title": "Controlling the stoichiometry and strand polarity of a tetramolecular G-quadruplex structure by using a DNA origami frame",
"caption": "The snapshots of the real-time HS-AFM imaging of the conformational changes. (a) Salt-induced formation of a G-quadruplex. The origami was prepared and immobilized on mica surface in a KCl-free buffer, while the imaging was carried out in a buffer that contained 100 mM KCl. (b) The deformation of a G-quadruplex structure under KCl-free environment. The origami was prepared and immobilized on mica surface in a buffer containing 100 mM KCl, whereas the observation buffer contained no KCl. The long duplex system (67-mer top and 77-mer bottom duplexes) with six G-repeats was used in these studies. The numbers at the top left corner represent the imaging time in second. Image size: 125 125 nm; scan speed: 0.2 frame/s. [MgCl2] = 10 mM; [TrisHCl] = 20 mM, pH 7.6. For real-time movies, see Supplementary section.\r",
"year_pub": "2013",
"img_loc": "pubmed/img/PMC3794576_gkt592f4p.jpg",
"longname": "Nucleic Acids Research",
"pmid": 23863846
},
{
"class_name": "composite",
"pmcid": 2881423,
"is_composite": true,
"doi": "10.4110/in.2010.10.2.35",
"is_repret": 0.0,
"title": "Expression of a Functional zipFv Antibody Fragment and Its Fusions with Alkaline Phosphatase in the Cytoplasm of an ",
"caption": "Western blot showing the expression of the VH-Fos-myc, the VH-Fos-myc-AP or the VL-Jun-AP fragments of the zipFv-112, the zipFv-112(H-AP) and the zipFv-112(L-AP). Origami(DE3) cells expressing the zipFv-112, the zipFv-112(H-AP) and the zipFv-112(L-AP) were grown in 2 YT/amp medium supplemented with 0.1 mM IPTG, and total proteins in the cell lysates were separated by using 12% SDS-PAGE at reducing (A, lane 1 on B) and non-reducing (Lane 2 on B) condition. Western blot was performed using either mouse anti-myc tag mAb followed by goat anti-mouse IgG-AP conjugated or NBT/BCIP substrate directly as described under Materials and Methods to detect the presence of the VH-Fos-myc and the VH-Fos-myc-AP, or AP activity of the VH-Fos-myc-AP and the VL-Jun-AP polypeptides, respectively.\r",
"year_pub": "2010",
"img_loc": "pubmed/img/PMC2881423_in-10-35-g005.jpg",
"longname": "Immune Network",
"pmid": 20532123
}
]
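In the search response above, is_composite arrives as a boolean and is_repret as a float, whereas the schema declares both as integers. A client that wants uniform values could coerce them after parsing. The helper below is a hypothetical sketch of that step and does not describe how the server behaves.

```python
def normalize_record(record):
    """Return a copy of a record with the flag fields coerced to integers."""
    fixed = dict(record)  # shallow copy so the caller's record is untouched
    for field in ("is_composite", "is_repret"):
        if fixed.get(field) is not None:
            fixed[field] = int(fixed[field])  # True -> 1, False -> 0, 0.0 -> 0
    return fixed


# Flag values as they appear in the search response above.
sample = {"class_name": "composite", "is_composite": True, "is_repret": 0.0}
print(normalize_record(sample))
```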
View the VizioMetrics API documentation, powered by Swagger.
Get Images from S3
Request URL
GET http://s3-us-west-2.amazonaws.com/escience.washington.edu.viziometrics/{s3_key}
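As an illustration, an image can be saved to disk by substituting an img_loc value for {s3_key}. The sketch below uses only the standard library; the function names and the default images directory are hypothetical, and it assumes the bucket serves the files to anonymous GET requests.

```python
import os
import shutil
from urllib.request import urlopen

# Bucket URL copied from the request URL above.
S3_BASE = "http://s3-us-west-2.amazonaws.com/escience.washington.edu.viziometrics"


def s3_url(s3_key):
    """Join the bucket URL and an img_loc key into a full image URL."""
    return f"{S3_BASE}/{s3_key.lstrip('/')}"


def local_path(s3_key, dest_dir="images"):
    """Choose a local file path from the last component of the key."""
    return os.path.join(dest_dir, os.path.basename(s3_key))


def download_image(s3_key, dest_dir="images", timeout=30):
    """Download one image into dest_dir and return the saved path (needs network)."""
    os.makedirs(dest_dir, exist_ok=True)
    path = local_path(s3_key, dest_dir)
    with urlopen(s3_url(s3_key), timeout=timeout) as resp, open(path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return path
```

For example, download_image("pubmed/img/PMC2872101_nihms182085f1.jpg") would write the first figure of the earlier example response into the images directory.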
Paper Data
Obtain paper metadata and impact scores measured by Eigenfactor.
Get Paper Metadata
Use PMID or PMCID
Request URL
GET http://viziometrics.org/api/pmc/paper/pmcid/{pmcid}/
or
GET http://viziometrics.org/api/pmc/paper/pmid/{pmid}/
Return Schema
{
pmid (integer),
pmcid (integer),
doi (string),
longname (string),
title (string),
abstract (string),
authors (string),
year_pub (string),
ef (number)
}
Example Code
GET http://viziometrics.org/api/pmc/paper/pmid/19350035/
Response Body
[
{
"pmid": 19350035,
"pmcid": 2836238,
"doi": "10.1038/nnano.2009.5",
"longname": "Nature nanotechnology", #Journal Name
"title": "Dynamic Patterning Programmed by DNA Tiles Captured on a DNA Origami Substrate",
"abstract": "",
"authors": "Seeman Nadrian C., Xiao Shou-Jun, Chao Jie and Gu Hongzhou",
"year_pub": "2009",
"ef": 1.23617e-7 #Article-level Eigenfactor Score
}
]
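The paper endpoints can be wrapped in the same way as the image endpoints. In this sketch the function names are hypothetical, the base URL is copied from the request URLs above, and the second record passed to the ranking helper is made up purely to show ordering by the ef field.

```python
# Base URL copied from the request URLs above.
PAPER_API = "http://viziometrics.org/api/pmc/paper"


def paper_url(identifier, id_type="pmid"):
    """Build the metadata request URL for a PMID or PMCID."""
    if id_type not in ("pmid", "pmcid"):
        raise ValueError("id_type must be 'pmid' or 'pmcid'")
    return f"{PAPER_API}/{id_type}/{int(identifier)}/"


def rank_by_eigenfactor(papers):
    """Sort paper records by their ef score, highest first; missing scores count as 0."""
    return sorted(papers, key=lambda p: p.get("ef") or 0.0, reverse=True)


papers = [
    {"pmid": 1, "ef": 1e-8},  # made-up record, for illustration only
    {"pmid": 19350035, "ef": 1.23617e-7},  # values from the response above
]
print(paper_url(19350035))
print([p["pmid"] for p in rank_by_eigenfactor(papers)])
```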