Finding and properly assessing strong sources of epidemiological data is increasingly important in this age of evidence-based medicine. As the pharmaceutical industry shifts its focus towards niche and rarer diseases, the need for accurate and relevant data is growing.

With less general data available for many of these niche and currently untapped disease markets, there is tremendous value in identifying and understanding the correct patient profile as early as possible.

By better understanding the causes of such diseases, innovative product developers will be able to outline a more targeted drug candidate profile, which will encourage the design of better clinical trials.

This need for a richer set of patient data means more in-depth information is required during the initial research stage, including:

  • Co-morbidities – the presence of one or more additional disorders (or diseases)
  • Pathophysiology – functional changes associated with a disease or syndrome
  • Vital signs – measurements of the body's functions (e.g. blood pressure)
  • Genetic profiles – particularly important with rare diseases and paediatric-specific conditions


Finding Suitable Epidemiological Data

Knowing the type of data you are looking for is a crucial step in epidemiology research. However, that knowledge is only useful if you can find accurate, reliable and up-to-date sources.

Finding such information can be a minefield. As access to publicly available patient data grows, analysts find it ever harder to determine which information is relevant when creating patient profiles.

The task is only going to get more difficult, with the amount of available healthcare data expected to grow exponentially in the coming years.

There are a number of online sources of quality information available, such as Medline, the National Health and Nutrition Examination Survey (NHANES) and the Surveillance, Epidemiology, and End Results (SEER) program.

The trouble is that even these high-quality sources have limitations, and their data needs to be carefully weighted and scaled before being applied to the general population. When you also factor in the large volume and inconsistent quality of publications related to a given disease area, it becomes extremely tough to pick out relevant and applicable information.
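
To make the idea of weighting concrete, here is a minimal Python sketch of direct standardisation: prevalence measured in each age group of a survey is re-weighted by that group's share of the target population. All figures, age bands and names (`standardised_prevalence`, `sample_share` and so on) are hypothetical and are not taken from NHANES, SEER or any other source named here.

```python
# Hypothetical illustration of re-weighting survey prevalence to a target
# population. Every number below is invented for the example.

def standardised_prevalence(stratum_prevalence, population_share):
    """Weight stratum-specific prevalence by each stratum's population share."""
    total_share = sum(population_share.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")
    return sum(stratum_prevalence[s] * population_share[s] for s in population_share)

# Prevalence observed within each age band of the survey sample (assumed).
sample_prevalence = {"18-39": 0.02, "40-64": 0.08, "65+": 0.20}
# The survey over-samples older people compared with the real population (assumed).
sample_share = {"18-39": 0.20, "40-64": 0.30, "65+": 0.50}
population_share = {"18-39": 0.40, "40-64": 0.40, "65+": 0.20}

# Crude prevalence reflects the sample's age mix, not the population's.
crude = sum(sample_prevalence[s] * sample_share[s] for s in sample_share)
adjusted = standardised_prevalence(sample_prevalence, population_share)

print(f"crude: {crude:.3f}, adjusted: {adjusted:.3f}")
```

In this made-up case the unweighted figure overstates prevalence because older age groups are over-represented in the sample, which is the kind of distortion weighting is meant to correct.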

Limitations of publicly available information:

  • Data may not have been collected expressly for the purpose of epidemiological research
  • Biased sampling and data representation
  • Lack of recent, up-to-date data
  • Need to utilise several different databases to obtain a comprehensive reference collection
  • High levels of skilled interpretation and data manipulation are often required

A Better Alternative For Epidemiological Data

Given the limitations stated above, a better source of the required data may be a purpose-built, highly specialised database such as the Epiomic segmentation database.

This database provides a much more robust, evidence-based source for patient population data. The Epiomic database goes beyond basic prevalence or incidence and segments patient populations according to relevant biomarkers, clinical parameters and co-morbid conditions.

This more detailed data segmentation supports more accurate and robust product valuations and forecasts.

What To Expect From A Database

A solid epidemiology database should be assessed and judged by a stringent set of criteria. A good database will have the ability to deliver the following:

  • Top line prevalence or incidence populations depending on the specific disease.
  • Ability to split the relevant patient population by gender
  • Breakdown by 5-year or relevant age cohort
  • The inclusion of patient sub-populations based on important vital signs, pathophysiology or co-morbidities
  • User-specified biometric distributions. The ability to specify cut points for population distributions such as blood pressure, lipid profile, lung function, BMI and kidney function to generate unique patient segments
  • Highlighting of differences between countries across the globe
  • No reliance on a single source of information; instead, outputs are triangulated between the different sources to remove potential bias and generate a more robust result
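
As a rough illustration of what triangulating between sources can look like, the Python sketch below pools several prevalence estimates using fixed-effect inverse-variance weighting, a standard meta-analysis technique. This is only one possible approach and is an assumption for illustration; it is not a description of how any particular database combines its sources, and the estimates and standard errors are invented.

```python
# Hypothetical sketch: pooling prevalence estimates from several sources with
# fixed-effect inverse-variance weighting. The input figures are invented.

def pool_estimates(estimates):
    """Pool (estimate, standard_error) pairs; more precise sources weigh more."""
    weights = [1.0 / (se ** 2) for _, se in estimates]
    total_weight = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / total_weight
    pooled_se = (1.0 / total_weight) ** 0.5
    return pooled, pooled_se

# (prevalence estimate, standard error) from three made-up sources.
sources = [(0.050, 0.010), (0.060, 0.010), (0.040, 0.020)]

pooled, pooled_se = pool_estimates(sources)
print(f"pooled prevalence: {pooled:.4f} (SE {pooled_se:.4f})")
```

The least precise source still contributes, but it moves the pooled figure less than the two tighter estimates do, so no single source dominates the result.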

Having access to a database that meets the above criteria will benefit a pharma company in key areas such as commercial planning and business development, the development of clinical trials, and health outcomes research and market access functions.

The Epiomic database is an example of a database that can deliver data to match these crucial criteria.

What The Epiomic Database Provides

The Epiomic database by Black Swan Analysis goes far beyond any publicly available information and also differs from other commercial epidemiology databases.

Here is what you can expect from the data inside the unique Epiomic database:

  • Robust & Reliable – rigorous enough to stand up to scrutiny from regulatory bodies
  • Rare disease coverage – applying proprietary incidence disease modelling techniques which incorporate the often-limited patient data available for rarer diseases, we have generated estimated numbers of patients for 35+ rare diseases
  • Quality – utilising the most up-to-date information from patient registries, clinical trials and epidemiology studies to generate accurate patient population estimates. All diseases are routinely updated every 6 to 12 months, with more frequent updates for diseases that attract greater interest and more publications
  • Breadth and depth – covering 160+ diseases, which include 11,500+ unique subpopulations
  • Extensive in-patient data – hospital admissions and procedures data is available by ICD-9 code at the 3-digit level for most EU member countries
  • Flexibility – subscribers can select disease data for a 10 to 100-year forecast period split by gender and 5-year age cohort
  • Intuitive – the online interface makes it simple to navigate through the site and between different diseases and sub-populations, with rapid access to data, easy extraction and quick Excel downloads
  • Biometric distributions – specify cut-points for population distributions such as blood pressure, lipid profile, lung function, BMI and kidney function to generate unique patient segments within the overall disease or specific disease profile
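
To show what specifying cut-points means in practice, here is a small Python sketch that assigns patients to segments from a list of user-chosen thresholds. The BMI values, thresholds and the `segment_counts` helper are hypothetical and only illustrate the general idea; they do not represent the Epiomic interface or its data.

```python
# Hypothetical sketch: segmenting a patient population by user-specified
# cut-points on a biometric measure. Values and thresholds are invented.
from bisect import bisect_right
from collections import Counter

def segment_counts(values, cut_points, labels):
    """Count values per segment; each cut-point is the lower bound of the next segment."""
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut-points")
    if list(cut_points) != sorted(cut_points):
        raise ValueError("cut-points must be in ascending order")
    return Counter(labels[bisect_right(cut_points, v)] for v in values)

bmi_values = [19.5, 24.9, 25.0, 27.3, 30.0, 34.1, 22.0]
cut_points = [25.0, 30.0]
labels = ["<25", "25 to <30", ">=30"]

counts = segment_counts(bmi_values, cut_points, labels)
for label in labels:
    print(label, counts[label])
```

Changing `cut_points` to different thresholds regroups the same patients into new segments without touching the underlying measurements.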

The Epiomic database also covers a comprehensive range of over 160 prominent and rare diseases, including over 11,500 sub-populations, organised into 15 therapeutic categories. Each category includes all relevant diseases for the therapy area, with easy navigation between diseases. The data shows an adjustable 10-year forecast split by gender and 5-year age cohort, giving a very detailed perspective of a treatment-eligible population.
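
For readers unfamiliar with cohort-based forecasts, the Python sketch below shows the basic arithmetic in its simplest form: a prevalence rate for each gender and 5-year age cohort is applied to a projected population for each forecast year. It assumes prevalence stays constant over time, and every rate, population figure and name is hypothetical. It is not the modelling approach used by Epiomic, which is not described here.

```python
# Hypothetical sketch: patient forecast by gender and 5-year age cohort,
# assuming constant prevalence. All rates and populations are invented.

def forecast_patients(prevalence, population_by_year):
    """Return {year: {(sex, cohort): patients}} from rates and projected populations."""
    return {
        year: {key: round(pop * prevalence[key]) for key, pop in strata.items()}
        for year, strata in population_by_year.items()
    }

prevalence = {("F", "60-64"): 0.030, ("M", "60-64"): 0.040}
population_by_year = {
    2025: {("F", "60-64"): 100_000, ("M", "60-64"): 95_000},
    2026: {("F", "60-64"): 102_000, ("M", "60-64"): 96_500},
}

forecast = forecast_patients(prevalence, population_by_year)
totals = {year: sum(strata.values()) for year, strata in forecast.items()}

for year, total in totals.items():
    print(year, total)
```

A real forecast would cover every cohort and typically allow rates to change over time, but the per-cell multiplication is the same.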

Finally, geographic coverage is superior to the majority of available databases with patient populations fully available for 19 leading global markets with ongoing expansion in coverage.

You can find out more about the benefits and features of access to the Epiomic database from Black Swan Analysis.
