Frequently Asked Questions
Codegen.eu generates an online report from your raw genetic data and gives you the tools to explore it easily, free of charge. Browse for diseases or traits in a modern search-like interface and explore the links between your genome and over 2000 diseases (grouped in 15 major topics). Ranking algorithms list the most relevant results first.
Don't know where to start exploring? Go to the dashboard to see the impact curve of your top 10 variants compared with the general population, as well as nutrition, fitness and other relevant information.
The raw genetic data that Codegen.eu can process must specify the rsid, chromosome, position and genotype for every SNP listed in the uploaded file. Below is an example of the data and the format expected in the uploaded file in order to generate a report (each SNP should be listed on a separate line; columns can be separated by a space, tab, comma or colon):
# rsid chromosome position genotype
rs4477212 1 82154 AA
rs3094315 1 752566 AA
rs3131972 1 752721 GG
rs12124819 1 776546 AA
rs11240777 1 798959 GG
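The format above can be read with a few lines of code. The following is a minimal sketch, not the parser Codegen.eu actually uses: the names `Snp` and `parse_raw_data` are hypothetical, and treating lines that start with `#` as comments is an assumption based on the header line in the example.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class Snp(NamedTuple):
    """One SNP record (hypothetical type, for illustration only)."""
    rsid: str
    chromosome: str
    position: int
    genotype: str


# Columns may be separated by space, tab, comma or colon.
_SEPARATORS = re.compile(r"[ \t,:]+")


def parse_raw_data(lines: Iterable[str]) -> Iterator[Snp]:
    """Yield one Snp per data line, skipping blank and '#' lines."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumption: '#' marks a header or comment line
        fields = _SEPARATORS.split(line)
        if len(fields) != 4:
            raise ValueError(f"line {number}: expected 4 columns, got {len(fields)}")
        rsid, chromosome, position, genotype = fields
        yield Snp(rsid, chromosome, int(position), genotype)
```

For example, `list(parse_raw_data(open("genome.txt")))` would return the records of a text file named `genome.txt` (a placeholder file name).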
You can upload your data either archived (recommended - both zip and gzip formats are supported) or as a text file (both txt and csv formats are supported).
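If you want to check locally that an archive contains readable text before uploading it, something like the following could be used. It is only a sketch: `read_upload` is a hypothetical helper, it assumes UTF-8 text, and for zip archives it assumes the first file in the archive is the raw data.

```python
import gzip
import io
import zipfile


def read_upload(data: bytes) -> str:
    """Return the text of a gzip, zip or plain-text upload."""
    if data[:2] == b"\x1f\x8b":  # gzip magic number
        return gzip.decompress(data).decode("utf-8")
    buffer = io.BytesIO(data)
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as archive:
            # Ignore directory entries; take the first real file (assumption).
            names = [n for n in archive.namelist() if not n.endswith("/")]
            if not names:
                raise ValueError("zip archive contains no files")
            return archive.read(names[0]).decode("utf-8")
    return data.decode("utf-8")  # plain txt or csv
```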
We currently support raw genome data from providers such as 23andMe and Genes for Good. However, you should be able to use this service even if your provider is not listed, as long as your data is in the supported format described above.
You can always generate your report at a later time by simply re-uploading your raw genetic data. To protect user anonymity, we do not maintain named accounts or keep the raw genetic data file.
We store a hash of the DNA data, so that when the same input DNA data is uploaded again the hash matches and we can retrieve the favorites and other settings stored for that hash. In effect, your DNA signature is your login. However, if you selected the Erase all data option at logout, all your personal information has been irreversibly deleted. You can regenerate your report by uploading your raw genetic data file, but any settings or bookmarked genotypes associated with your data are no longer available.
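The idea of using a hash as a login can be sketched as follows. This is an illustration only: the hash function used by Codegen.eu is not stated here, so SHA-256 is an assumption, and `dna_signature` and `SettingsStore` are hypothetical names.

```python
import hashlib
from typing import Dict, List


def dna_signature(raw_data: bytes) -> str:
    """Hash the uploaded bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(raw_data).hexdigest()


class SettingsStore:
    """Keeps settings keyed by signature; the raw data itself is never stored."""

    def __init__(self) -> None:
        self._by_signature: Dict[str, Dict[str, List[str]]] = {}

    def load(self, raw_data: bytes) -> Dict[str, List[str]]:
        # The same upload yields the same signature, hence the same settings.
        return self._by_signature.setdefault(
            dna_signature(raw_data), {"favorites": []}
        )

    def erase(self, raw_data: bytes) -> None:
        # Models the "Erase all data" option: the settings are gone for good.
        self._by_signature.pop(dna_signature(raw_data), None)
```

Uploading the same file twice returns the same settings, while a file that differs by even one byte produces a different signature and therefore starts with empty settings.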
The information provided in the reports is based on third-party sources such as SNPedia.com and dbSNP (US National Library of Medicine). The reports are generated by applying ML algorithms trained on usage data and human-labelled data.
Codegen.eu is not responsible or liable for the accuracy, usefulness or availability of any information transmitted or made available via the site. The service is offered for informational purposes only and should not be considered medical advice. Always consult with a qualified physician for diagnosis and for answers to your personal questions.
Each generated report has between 1500 and 3000 pages. For performance reasons, and because most genotypes have close to no impact, we don't offer the option to print the entire report.
However, partial reports can be printed, for example the report for a given topic or the summary page. When accessing our service from a laptop or desktop computer, look for the Print view button on the right-hand side of a topic page.
The colored badge (green, yellow, red or blue) indicates the relevance of a given genotype for you. The color is correlated with the associated impact score. For example, good:3.2 means that the genotype has a score of 3.2 and a positive impact for you.
The impact score is a value between 0 and 10 that was machine-learned from human-labelled examples. In absolute terms, the scale is quasi-logarithmic:
- ≤ 2: interesting but non-actionable information
- 3: this might be useful/actionable
- 4: you are male or female
- 5: this could impact you more than your sex
- 6: medium-high probability of a trait or condition
- ≥ 7: significant positive trait or serious condition...
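As a rough illustration of how a badge such as good:3.2 relates to the scale above, the sketch below splits a badge into its label and score and looks up the matching band. The function names are hypothetical, and mapping fractional scores to the band whose lower bound they have reached (so 3.2 falls in band 3 and anything below 3 in the first band) is an assumption, since the scale only lists whole numbers.

```python
import bisect
from typing import Tuple

# Lower bounds of bands 3, 4, 5, 6 and >= 7; anything below 3 is the first band.
_LOWER_BOUNDS = [3, 4, 5, 6, 7]
_BANDS = [
    "interesting but non-actionable information",
    "this might be useful/actionable",
    "you are male or female",
    "this could impact you more than your sex",
    "medium-high probability of a trait or condition",
    "significant positive trait or serious condition",
]


def parse_badge(badge: str) -> Tuple[str, float]:
    """Split a badge like 'good:3.2' into ('good', 3.2)."""
    label, _, value = badge.partition(":")
    return label, float(value)


def describe_impact(score: float) -> str:
    """Return the scale description for an impact score in [0, 10]."""
    if not 0 <= score <= 10:
        raise ValueError("impact score must be between 0 and 10")
    return _BANDS[bisect.bisect_right(_LOWER_BOUNDS, score)]
```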
The genotype frequency indicates how common the given genotype is within the population. A rare genotype, for example one with a frequency of 2.27%, is more relevant to you since fewer people have it. The blue color is used only for better visibility of the text.
The letters for some SNPs (e.g. rs3135391 or rs3135388) have been swapped because the strandedness (also known as orientation) differs between the raw genetic data and the way the genotype is listed in SNPedia, the main source of genotype descriptions. For example, 23andMe reports genotypes based on the plus strand, while SNPedia lists some genotypes with the minus orientation (see the SNPedia article on orientation for more information).
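Swapping the letters amounts to replacing each base with its complement (A with T, C with G). A minimal sketch, with hypothetical function names and not necessarily the way Codegen.eu implements it:

```python
# Complementary bases: A pairs with T, C pairs with G.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def flip_strand(genotype: str) -> str:
    """Convert a genotype to the opposite strand, e.g. 'AG' becomes 'TC'.

    Characters other than A, C, G and T (such as '-') are left unchanged.
    """
    return genotype.upper().translate(_COMPLEMENT)


def matches_reference(raw: str, reference: str, reference_is_minus: bool) -> bool:
    """Compare a plus-strand genotype with a reference entry.

    The order of the two alleles is ignored, so 'CT' matches 'TC'.
    """
    candidate = flip_strand(raw) if reference_is_minus else raw.upper()
    return sorted(candidate) == sorted(reference.upper())
```

With this sketch, a plus-strand genotype CT in the raw data matches a reference entry listed as AG in minus orientation.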
The number shown for each topic, which can be negative or positive, is a weighted sum of the scores of all genotypes included in that topic. Genotypes labelled as bad or warning are summed with a negative sign, and all others with a positive sign.
For example, a topic score of -9.8 means that there were most likely a couple of bad genotypes with high scores that outweighed any good genotypes in that topic. For individual genotypes, however, those with a score greater than 4 (regardless of the label, good or bad) are more likely to be relevant than the rest. The magnitude of individual genotype scores is likely to be more relevant than the topic score.
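The signed sum can be sketched as follows. The actual weights are not described here, so the `weight` field (defaulting to 1.0) is an assumption, and `Genotype` and `topic_score` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable

# Labels that count against the topic, as described above.
NEGATIVE_LABELS = {"bad", "warning"}


@dataclass
class Genotype:
    label: str           # e.g. "good", "bad" or "warning"
    score: float         # impact score between 0 and 10
    weight: float = 1.0  # assumption: the real weights are not published


def topic_score(genotypes: Iterable[Genotype]) -> float:
    """Weighted sum of scores; bad and warning genotypes count negatively."""
    total = 0.0
    for g in genotypes:
        sign = -1.0 if g.label in NEGATIVE_LABELS else 1.0
        total += sign * g.weight * g.score
    return total
```

With equal weights, two bad genotypes scoring 6.0 and 5.0 and one good genotype scoring 1.2 give a topic score of about -9.8, which matches the kind of situation described above.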