ClinVar Pathogenic Variants
Clinical
Clinically curated pathogenic variant database with detailed phenotype associations.
45 GB · 1,250,000 data items · Level 2 · Updated 2026-01-20
Clinical variants, Pathogenicity, ACMG classification
Schema preview
Schema Verified

| Column | Type | Description | Nullable |
|---|---|---|---|
| sample_id | VARCHAR(50) | Unique sample identifier | No |
| patient_age | INTEGER | Patient age | Yes |
| diagnosis | VARCHAR(200) | Diagnosis code | No |
| gene_expression | ARRAY<FLOAT> | Gene expression vector | No |
| variant_calls | JSON | Variant call information | Yes |
| collection_date | DATE | Sample collection date | No |
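
The schema above can be mirrored in code as a typed record. The following is a minimal sketch, not part of the dataset's tooling: the class name `SampleRecord` and the `validate` method are hypothetical, and the checks cover only what the table states (nullability and the two `VARCHAR` length limits).

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Dict, List, Optional


@dataclass
class SampleRecord:
    """Hypothetical in-memory view of one row of the schema preview."""
    sample_id: str                            # VARCHAR(50), not nullable
    patient_age: Optional[int]                # INTEGER, nullable
    diagnosis: str                            # VARCHAR(200), not nullable
    gene_expression: List[float]              # ARRAY<FLOAT>, not nullable
    variant_calls: Optional[Dict[str, Any]]   # JSON, nullable
    collection_date: date                     # DATE, not nullable

    def validate(self) -> None:
        """Raise ValueError if a constraint listed in the schema is violated."""
        if self.sample_id is None or len(self.sample_id) > 50:
            raise ValueError("sample_id must be non-null and at most 50 characters")
        if self.diagnosis is None or len(self.diagnosis) > 200:
            raise ValueError("diagnosis must be non-null and at most 200 characters")
        if self.gene_expression is None:
            raise ValueError("gene_expression must not be null")
        if self.collection_date is None:
            raise ValueError("collection_date must not be null")


# Illustrative values only; they are not taken from the dataset.
record = SampleRecord(
    sample_id="SMP-EXAMPLE",
    patient_age=None,          # nullable column
    diagnosis="C34.1",
    gene_expression=[0.1, 0.2],
    variant_calls=None,        # nullable column
    collection_date=date(2025, 1, 1),
)
record.validate()
```

Nullable columns map to `Optional` fields, so a missing `patient_age` or `variant_calls` passes validation while a missing `sample_id` does not.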
Sample data
Showing 3 mock records

| sample_id | patient_age | diagnosis | gene_expression | variant_calls | collection_date |
|---|---|---|---|---|---|
| SMP-001 | 45 | C34.1 | [0.23, 0.87, ...] | {...} | 2025-06-15 |
| SMP-002 | 62 | C50.9 | [0.45, 0.12, ...] | {...} | 2025-07-22 |
| SMP-003 | 38 | C18.0 | [0.78, 0.34, ...] | {...} | 2025-08-10 |
Usage guide
Data access
This dataset must be accessed inside a secure trusted execution environment (TEE). Create a new workspace via Projects; the dataset will be automatically mounted into your Jupyter environment.
Compliance statement
This dataset has passed IRB review. Any use must comply with the Public Domain license terms. Do not attempt re-identification, and do not use the data for insurance underwriting or similar purposes.
Citation
@dataset{ds-004,
title = {ClinVar Pathogenic Variants},
provider = {NCBI},
year = {2025},
license = {Public Domain}
}
Billing model
Per data item pricing
$0.0003/data item/30 days
You are billed for the number of data items you access. Each data item is licensed for 30 days, with unlimited reads during the license window.
Metering policy
- Unique Data Item: billed once per unique data_id; repeat reads within 30 days are not charged again
- Auto-renew: after the license expires, continued access automatically renews the data item license
- Volume discounts: tiered discounts apply for purchases over 1M data items
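
The first two metering rules can be illustrated with a small simulation. This is a sketch under stated assumptions, not the platform's billing code: the `Meter` class is hypothetical, a renewed license is assumed to start at the time of the first read after expiry, and a read at exactly the 30-day mark is assumed to count as expired. Volume discounts are not modeled because the tiers are not specified.

```python
from datetime import datetime, timedelta
from typing import Dict

LICENSE_WINDOW = timedelta(days=30)


class Meter:
    """Hypothetical meter: tracks per-item licenses and counts billable reads."""

    def __init__(self) -> None:
        self._expires_at: Dict[str, datetime] = {}  # data_id -> license expiry
        self.billed_items = 0

    def read(self, data_id: str, when: datetime) -> bool:
        """Record a read and return True if it was billed."""
        expiry = self._expires_at.get(data_id)
        if expiry is not None and when < expiry:
            return False  # still licensed: repeat reads are free
        # First read, or license expired: start (or auto-renew) a 30-day license.
        self._expires_at[data_id] = when + LICENSE_WINDOW
        self.billed_items += 1
        return True


meter = Meter()
start = datetime(2026, 1, 1)
meter.read("item-1", start)                       # billed: first access
meter.read("item-1", start + timedelta(days=29))  # free: inside the window
meter.read("item-1", start + timedelta(days=31))  # billed: auto-renewal
assert meter.billed_items == 2
```

Keying the license table on `data_id` is what makes repeat reads free: only a read with no active license for that item increments the billed count.
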
100k data items / 30 days: $30
Full dataset / 30 days: $375
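
Both price points follow from the flat per-item rate, as the sketch below shows. The function name `estimate_cost` is hypothetical, and tiered volume discounts are left out because their thresholds and rates are not given here.

```python
from decimal import Decimal

PRICE_PER_ITEM = Decimal("0.0003")  # USD per data item per 30-day license


def estimate_cost(unique_items: int, windows: int = 1) -> Decimal:
    """Undiscounted cost of licensing `unique_items` for `windows` 30-day periods."""
    if unique_items < 0 or windows < 0:
        raise ValueError("unique_items and windows must be non-negative")
    return PRICE_PER_ITEM * unique_items * windows


assert estimate_cost(100_000) == Decimal("30")      # 100k data items / 30 days
assert estimate_cost(1_250_000) == Decimal("375")   # full dataset / 30 days
```

`Decimal` is used instead of `float` so that small per-item rates multiply out to exact currency amounts.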
Provider
NCBI
Public Domain