This webpage provides information about the data files used in our web-based query system on injuries, EpiCenter.
Fatal Data, Nonfatal Patient Discharge Data, and Nonfatal Emergency Department Data
California Electronic Violent Death Reporting System (CalEVDRS)
Linked Crash Medical Outcomes (CMOD) Data
About the Fatal (Death) Data, Nonfatal Patient Discharge (Hospitalization) Data, and Nonfatal Emergency Department (ED) Data
Information about fatal injuries comes from the California Department of Public Health’s Death Statistical Master file. These data come from death certificates that are registered in California each year. The SAC Branch uses this file to describe California residents who die as a result of injury (that is, whose death certificate includes an external cause of injury).
Prior to 1999, the cause of death was coded using the International Classification of Diseases, Ninth Revision (ICD-9). Beginning in 1999, deaths have been coded using the Tenth Revision of the ICD (ICD-10). These two revisions are significantly different. Users need to be aware that changes in the number of specific injuries observed over time may be due to changes in coding practices rather than true changes in causes of death. More information about ICD-10 and the effects of the change in coding can be found in our FAQ or at the National Center for Health Statistics.
Information about nonfatal injuries comes from the California Office of Statewide Health Planning and Development Patient Discharge Data (PDD) and Emergency Department (ED) Data. The PDD data set contains information on patients discharged from all non-Federal hospitals in California, and the ED data set contains information on patients who were admitted to an emergency department in California, then treated and released, or transferred to another facility. SAC uses this data to describe people who are hospitalized or in the ED as a result of an injury (that is, whose discharge diagnosis includes an external cause of injury (E-Code)).
Records for PDD and ED data represent the first hospitalization or ED visit for the injury in question, but may not be the only record for an individual person. Repeat visits for the same injury are not included in the file so each record represents an incident injury event. However, two separate injury events that require a hospitalization or ED visit would be counted twice in the PDD data or ED data. For example, a person who was hospitalized for a fall, was discharged to go home, and then fell again two weeks later would be counted in two separate records in the PDD data.
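The difference between counting injury events and counting people can be illustrated with a minimal Python sketch. The record layout below (person_id, cause, admit_date) is hypothetical and is not the actual PDD or ED file layout; it only mirrors the fall example above.

```python
from collections import Counter

# Hypothetical, simplified records: one row per incident injury event.
# The field names are illustrative and are not the real PDD/ED layout.
records = [
    {"person_id": "A", "cause": "fall", "admit_date": "2010-03-01"},
    {"person_id": "A", "cause": "fall", "admit_date": "2010-03-15"},
    {"person_id": "B", "cause": "motor vehicle", "admit_date": "2010-04-02"},
]


def count_events(rows):
    """Each record is one injury event, so the event count is the row count."""
    return len(rows)


def count_people(rows):
    """Distinct people can be fewer than events when someone is injured twice."""
    return len({row["person_id"] for row in rows})


# Number of separate injury events recorded for each person.
events_per_person = Counter(row["person_id"] for row in records)

print(count_events(records), count_people(records), events_per_person["A"])
```

Here person A fell twice, two weeks apart, so the file holds three events for two people. Counts from the PDD and ED data should be read as events, not as individuals.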
Take a look at this Excel file: About the Death, Hospitalization, and ED variables to find out more detail about all the variables included in our fatal and nonfatal injury data sets, including what we exclude from the data on EpiCenter.
About the California Electronic Violent Death Reporting System (CalEVDRS) Data
California’s violent death data come from two separate data systems – California’s Violent Death Reporting System (CalVDRS) and California’s Electronic Violent Death Reporting System (CalEVDRS). The former was administered by CDPH from 2005-2008 as part of CDC’s National Violent Death Reporting System (NVDRS). The latter system is funded by The California Wellness Foundation and was created by CDPH in response to issues with CDC’s system. CalEVDRS has been functioning and expanding since 2007.
CalEVDRS was built to be compatible with NVDRS by using the same data specifications. It does not use the same methodology, however, which is why this data query separates the two systems. CalVDRS data were manually abstracted from hardcopy records into CDC software by CDPH and county health department staff who were trained in abstracting for NVDRS. CalEVDRS data are mostly entered by coroner staff from participating counties. Although these staff were trained in abstracting according to NVDRS definitions, CalEVDRS funding has not been sufficient to ensure ongoing training and quality assurance of data.
CDPH employed Santa Clara County for both systems to evaluate the data quality of CalEVDRS. This evaluation showed that the data are comparable overall and gave us insight into where further training was needed.
Besides the greater efficiency of the CalEVDRS system and the need for ongoing training and data quality assurance, there are some differences between the two systems to keep in mind when using the data query:
· CDC does not consider Supplementary Homicide Reports (SHR) a primary data source for NVDRS so their software did not contain data fields for many SHR data elements that are in CalEVDRS data.
o For example, “drive-by shootings” is a circumstance in SHR but there is no place to code it in NVDRS software. Thus, CalVDRS cases of “drive-by shooting” could be underreported, compared to CalEVDRS data, because CalVDRS does not have the benefit of SHR detail.
o SHR also often contains more detail than coroner records on firearm type (i.e., whether it was a handgun or a long gun). Again, the NVDRS software used by CalVDRS did not have a place where this information could be entered, so detail on firearm type may be lacking in CalVDRS compared to CalEVDRS. Abstractors noted the SHR firearm detail in a text field, and this text field was searched and recoded, so some of these cases may be captured. However, because different abstractors may have written this note in many different ways, the information is much more difficult to capture than if a consistent data field were available.
Some other things to consider when interpreting these data:
· CalVDRS data are not available yet for 2008. CalEVDRS data are available through 2009 so there is a considerable gap in 2008 where much of the combined data are missing.
· Violent deaths in these systems are reported by the county where the injury occurred. This means that if an injury occurred outside one of the participating counties and the victim was transported to a hospital in one of the participating counties and died there, that victim would not be reported in this data query. Where a victim was injured in one of the participating counties and died outside the participating counties, that person would be included in these data, to the extent we were able to identify injury location. The injury location of a small percentage of these deaths was unknown. In these cases, the county of injury was assumed to be the same as where the death occurred.
· In 2005, Alameda County only reported violent deaths where the injury occurred in the City of Oakland or to residents of Oakland (regardless of where the injury occurred). This means Alameda County violent deaths, as reflected by occurrence in the data query, contain only those that occurred in Oakland or those victims who resided in either Oakland, San Francisco, or Santa Clara County and who were injured anywhere in Alameda County.
· A few peculiarities of the CalEVDRS system –
o The toxicology module was inadequately developed initially, so data from 2007 through 2009 are very conservative. Positive toxicology results for these years should be interpreted as a minimum. The actual number of positive drug tests is likely higher. This module was fixed in 2010 to capture more accurate toxicology data.
o The weapon module is separate from the rest of CalEVDRS data elements and this causes data entry staff to overlook entering this information. Coroner staff have been notified of this and asked to go back and enter weapon information. Data will be updated periodically but that is the reason for a high number of “unknown” weapon types.
· These data are compiled for the purpose of better understanding the circumstances of violent deaths. Hopefully, these data can be used to inform homicide and suicide prevention efforts and policies. However, care must be taken in interpreting these data. As much as the definitions, training, and data quality assurance are standardized, these data, like violent death reporting data in all states, are not perfect. They are documented initially by death investigators, each with their own methods and biases, from interviews with friends and family members of the victims, also each with their own biases. The information is then abstracted by different people, depending on the county. These people are trained to reduce bias and report data consistent with other abstractors but human variation is inevitable.
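The county-of-injury rule described in the list above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the query system's actual code. The function names are hypothetical, and the set of participating counties is an example that should not be read as the real list.

```python
# Example set only; the actual list of participating counties may differ.
PARTICIPATING = {"Alameda", "San Francisco", "Santa Clara"}


def reporting_county(injury_county, death_county):
    """Report by county of injury; if it is unknown, assume the county of death."""
    if injury_county:
        return injury_county
    return death_county


def is_reported(injury_county, death_county, participating=PARTICIPATING):
    """A death appears in the query only if its reporting county participates."""
    return reporting_county(injury_county, death_county) in participating


# Injured outside the participating counties, died inside: not reported.
print(is_reported("Fresno", "Santa Clara"))
# Injured inside, died outside: reported.
print(is_reported("Alameda", "Fresno"))
# Injury location unknown: falls back to the county of death.
print(is_reported(None, "San Francisco"))
```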
If you have any further questions about California’s Electronic Violent Death Reporting System please visit the CalEVDRS website or contact Steve Wirtz at (916) 552-9831 or Steve.Wirtz@cdph.ca.gov.
About the Linked Crash Medical Outcomes (CMOD) Data
California’s Crash Medical Outcomes Data (CMOD) project is modeled on the National Highway Traffic Safety Administration (NHTSA) Crash Outcome Data Evaluation System (CODES). The CMOD project uses probabilistic linkage software, LinkSolv, to link data from police traffic crash records (i.e., scene investigations) to medical data (from emergency departments, hospitals, and, in a future update, death files). Probabilistic record linkage is useful when the data of interest come from two or more sources that do not have a common identifier for the same individual. Using information common to both the crash and medical files (like age, sex, date of injury) the linkage software mathematically decides whether two records are likely to refer to the same person.
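As a rough illustration of how probabilistic linkage can score a candidate pair of records, here is a minimal Python sketch in the general style of Fellegi-Sunter match weights. It is not LinkSolv's algorithm, which is not described here. The field names, the m and u probabilities, and the threshold are all assumed values chosen for the example.

```python
import math

# Assumed probabilities for each comparison field (illustrative values only):
#   m = P(field agrees | the two records are the same person)
#   u = P(field agrees | the two records are different people)
FIELD_PROBS = {
    "age": (0.95, 0.02),
    "sex": (0.98, 0.50),
    "injury_date": (0.90, 0.003),
}


def match_weight(crash, medical, probs=FIELD_PROBS):
    """Sum log2 likelihood ratios: agreement adds weight, disagreement subtracts."""
    total = 0.0
    for field, (m, u) in probs.items():
        if crash[field] == medical[field]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def is_probable_match(crash, medical, threshold=5.0):
    """Treat the pair as the same person when the weight reaches the threshold."""
    return match_weight(crash, medical) >= threshold


crash = {"age": 34, "sex": "F", "injury_date": "2009-06-12"}
same = {"age": 34, "sex": "F", "injury_date": "2009-06-12"}
other = {"age": 61, "sex": "F", "injury_date": "2009-08-30"}

print(is_probable_match(crash, same), is_probable_match(crash, other))
```

Agreement on a field that rarely agrees by chance (such as date of injury) adds much more weight than agreement on sex, which agrees by chance about half the time. That is why no single common identifier is needed.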
The hospital outcome data include persons classified as an injured driver, passenger, pedestrian or bicyclist on a collision report. Persons who died as a result of their injuries are not included in either the hospital or emergency department dataset.
Description of Variables
Outcome – Nonfatal emergency department (treat and release or transferred) refers to patients treated in emergency departments but not admitted. The vast majority are treated and released. A small number are transferred to another hospital for in-patient admission.
Nonfatal Hospitalized refers to persons admitted as in-patients, whether or not they had been treated in an emergency department.
Age (available in two formats)
- Single year of age - Each year of age will appear on its own line (for example: 0, 1, 2, 3, 4 … up to 90+)
- 5-year age groups - These start with "0-4" and go to "85-89". Persons over 89 years old are included in the category "90+"
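The 5-year grouping above can be expressed as a short Python function. This is a minimal sketch of the stated categories; the function name is hypothetical.

```python
def five_year_age_group(age):
    """Map a single year of age to its 5-year group label, with 90+ as the top group."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age >= 90:
        return "90+"
    lower = (age // 5) * 5
    return f"{lower}-{lower + 4}"


print([five_year_age_group(a) for a in (0, 4, 5, 89, 90)])
```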
Race/Ethnicity - We combine two separate categories, race and Hispanic ethnicity, into a single race/ethnicity category. We also combine some categories together (such as combining Asian sub-groups into a single "Asian" category). We do this so that we have comparable groups both across time and between fatal and nonfatal data. If you need more detail than we provide, please contact us and we can discuss what we have available in our data.
Sex - This is the gender of the injured person.
Drug/Alcohol Diagnosis – Whether the victim was diagnosed (primary or secondary) with alcohol or drug effects during the hospitalization or emergency department visit.
Crash temporal variables - The year, month, day of week, and time of day refer to when the collision occurred.
Role - As indicated on the collision report: motor vehicle driver, motor vehicle passenger, motorcyclist (includes motorcycle passenger), pedestrian, or bicyclist. The motorcyclist category in the CMOD query includes riders of motorized scooters and mopeds. Self-propelled scooter riders are classified as pedestrians, as are users of wheelchairs and similar mobility chairs.
Vehicle type - The type of vehicle the injured person was traveling in when the collision occurred. CMOD categories include:
- passenger car (includes minivans and SUVs)
- motorcycle (includes motorized scooters and mopeds)
- pick-up/panel truck
- truck/truck tractor: a truck with two or more axles, or truck tractor, operated singly or with one or more semi-trailers or trailers
- bus (includes school bus)
- all other vehicles (includes emergency vehicles, highway construction and other vehicles)
- not stated
Type of collision - The general type of collision which was the first event.
Primary collision factor - The one circumstance or driving action which, in the officer’s opinion, best describes the primary or main cause of the collision.
Safety equipment use - For vehicle occupants this refers to use of safety restraints such as seat belts and child passenger safety seats. For motorcycle and bicycle riders, safety equipment refers to helmet use.
Seat position - Indicates whether the vehicle occupant was in a front versus any rear seat.
Region - County of collision is grouped into one of seven regions of the state developed by the UCLA Center for Health Policy Research:
- Northern and Sierra Counties: Butte, Shasta, Humboldt, Del Norte, Siskiyou, Lassen, Modoc, Trinity, Mendocino, Lake, Tehama, Glenn, Colusa, Sutter, Yuba, Nevada, Plumas, Sierra, Tuolumne, Calaveras, Amador, Inyo, Mariposa, Mono, and Alpine
- Greater Bay Area: Santa Clara, Alameda, Contra Costa, San Francisco, San Mateo, Sonoma, Solano, Marin, and Napa
- Sacramento Area: Sacramento, Placer, Yolo, and El Dorado
- San Joaquin Valley: Fresno, Kern, San Joaquin, Stanislaus, Tulare, Merced, Kings, and Madera
- Central Coast: Ventura, Santa Barbara, Santa Cruz, San Luis Obispo, Monterey, and San Benito
- Los Angeles County: Los Angeles
- Other Southern California: Orange, San Diego, San Bernardino, Riverside, and Imperial
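For users who want to apply the same regional grouping to their own county-level data, the region definitions above can be turned into a lookup table. The Python sketch below uses the county lists exactly as given in the Region description; the variable and function names are hypothetical.

```python
# Region definitions copied from the Region description above.
REGIONS = {
    "Northern and Sierra Counties": (
        "Butte, Shasta, Humboldt, Del Norte, Siskiyou, Lassen, Modoc, Trinity, "
        "Mendocino, Lake, Tehama, Glenn, Colusa, Sutter, Yuba, Nevada, Plumas, "
        "Sierra, Tuolumne, Calaveras, Amador, Inyo, Mariposa, Mono, Alpine"
    ),
    "Greater Bay Area": (
        "Santa Clara, Alameda, Contra Costa, San Francisco, San Mateo, "
        "Sonoma, Solano, Marin, Napa"
    ),
    "Sacramento Area": "Sacramento, Placer, Yolo, El Dorado",
    "San Joaquin Valley": (
        "Fresno, Kern, San Joaquin, Stanislaus, Tulare, Merced, Kings, Madera"
    ),
    "Central Coast": (
        "Ventura, Santa Barbara, Santa Cruz, San Luis Obispo, Monterey, San Benito"
    ),
    "Los Angeles County": "Los Angeles",
    "Other Southern California": (
        "Orange, San Diego, San Bernardino, Riverside, Imperial"
    ),
}

# Invert the table so each county name points to its region.
COUNTY_TO_REGION = {
    county: region
    for region, names in REGIONS.items()
    for county in names.split(", ")
}


def region_for_county(county):
    """Return the region for a county of collision; raises KeyError if unknown."""
    return COUNTY_TO_REGION[county]


print(region_for_county("Yolo"), "|", region_for_county("Inyo"))
```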
Alcohol involved collision - Traffic collision where any driver, pedestrian, or bicyclist involved in the crash had been drinking.
Drug involved collision - Traffic collision where any driver, pedestrian, or bicyclist involved in the crash was under the influence of one or more drugs.
Primary diagnosis (available in two formats): The primary (or principal) diagnosis is the chief reason the patient was admitted to the hospital or treated in the emergency department. The primary diagnosis may be the patient's most serious problem, but sometimes it is not.
Nature of injury - The type of injury, such as burn, fracture, or open wound.
Body part injured - The general region of the body injured, such as lower extremity, torso, or vertebral column.
Disposition on discharge – Where the patient is sent upon discharge. Common dispositions are released to home and transferred to another facility.
Length of stay – The number of days an in-patient stayed in the hospital. There are five categories, ranging from same day/overnight to more than one week in the hospital. Length of stay does not apply to patients treated in the emergency department.
Expected source of payment - The expected source of payment is the type of payer that is expected to pay or did pay the greatest portion of the bill for the hospital stay. Examples are private insurance, Medicare, and Medi-Cal.
If you have any further questions about California’s Crash Medical Outcomes Data please visit the CMOD website or contact Lynn Walton-Haynes at (916) 552-9835 or Lynn.Walton-Haynes@cdph.ca.gov.
Population data are from the California Department of Finance (DOF)'s Demographic Research Unit. The data files used are the “Estimates of Race/Ethnic Population with Age and Gender Detail” data sets for 1990-1999 and 2000-2010, available on the DOF website. Demographic variables included on EpiCenter are: County, Age, Sex, and Race/Ethnicity. The data are used on EpiCenter to generate rates in some of our queries, and they also have their own query if you would like further detail on California's demographics. Note that the race/ethnicity categories in population data differ from the categories used in our fatal and nonfatal injury data sets. Population data combine "Unknown/Other" race/ethnicity with "White" and also include a "Multirace" category. These categories are included when you run the Population Data query. However, when rates are generated for our injury queries, the race/ethnicity categories displayed include "White/Unknown/Other" but not "Multirace", to make the injury data categories as comparable as possible to the population data categories.
Also, the California Department of Finance (DOF) did not start collecting and reporting data on "Multirace" until 2000. Therefore, the population numbers for specific races/ethnicities will differ considerably for years prior to 2000 compared to 2000 and later, since the population counted as "Multirace" would have been categorized as one of the other races/ethnicities prior to 2000. For this reason, caution must be used when comparing population numbers or rates by race/ethnicity between years before 2000 and years from 2000 onward.
The data included on EpiCenter incorporate updates based on the U.S. 2010 Census. This change was made in November 2012, so if you looked at the population numbers or used them to develop rates before November 2012, your numbers and rates may differ noticeably from those based on the updated numbers. If you still need more information about the population data, please contact the California Department of Finance's Demographic Research Unit.
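As a rough illustration of how a population denominator turns an injury count into a rate, here is a minimal Python sketch. It assumes rates are expressed per 100,000 population, which is a common convention but is not stated on this page. The function name and the figures are hypothetical.

```python
def crude_rate(events, population, per=100_000):
    """Crude rate: events divided by population, scaled to a standard base."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events * per / population


# Hypothetical figures: the same injury count against two population estimates.
# A revised denominator changes the rate even when the count is unchanged.
old_rate = crude_rate(50, 1_000_000)
new_rate = crude_rate(50, 1_250_000)
print(old_rate, new_rate)
```

The same logic explains why rates calculated before and after a population update, or across the pre-2000 and post-2000 race/ethnicity categories, may not be directly comparable.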