Princeton University Library Data and Statistical 




Finding Data: Data on Science, Technology, Computers, Internet



  • 'Brain Drain' Debate in the United Kingdom, c.1950-1970
    Qualitative project. Sought to provide an analysis of the 'brain drain' debate of the 1950s and 1960s as a social phenomenon. The term 'brain drain' was adopted in the 1960s in the context of concerns that the United Kingdom was losing skilled scientific and engineering personnel to other countries. Although the term is used in a variety of academic, policy and popular discussions about the international mobility of scientists, this project sought to rectify the absence of scholarly literature analyzing the original 'brain drain' debate. Comprises 19 oral history interviews with scientists and engineers who emigrated to the United States or Canada in the 1950s or 1960s, as well as with British policymakers involved in any way in the 'brain drain' debate at that time. Also included is the transcript of a 'witness seminar' that brought officials and former emigres together to discuss their recollections. To obtain a free account, please register with the UKDA.

  • Alliance for Audited Media (formerly Audit Bureau of Circulations)
    PDF reports of circulation statistics for newspapers, journals, and web publications. Princeton also has periodical geographic/circulation data for Spring 1999, Spring 2004, and 2014, as well as newspaper/geographic circulation data for Spring 1999, Fall 2000, and 2001-Spring 2015. Open ICPSR has "Circulation of US Daily Newspapers, 1924".

  • Amazon product data
    Contains product reviews and metadata from Amazon, including 143.7 million reviews spanning May 1996 - July 2014. Includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs).

  • AUTM licensing survey (1991-2003)
    Survey of U.S. and Canadian universities, hospitals, research institutions and patent management firms. Survey results are reported in a summary report and a comprehensive report. The comprehensive report, referred to as the Full Report, contains the Survey Summary and includes tables that present data obtained from individual respondents on an institution-by-institution basis. The CD-ROMs contain survey data for each fiscal year in Microsoft Excel spreadsheets.

  • Biennial Media Consumption Survey [1998 - 2002]
    Collects data on the public's use of, and attitudes toward, the Internet and traditional news outlets. Respondents are asked questions concerning their use of newspapers, television news, radio news, and news magazines. For later years, use Roper IPOLL.

  • Business R&D and Innovation Survey (BRDIS) (2008+)
    BRDIS expands NSF's coverage of business R&D and innovation activities. It is the successor to IRIS (Industrial Research and Development Information System). IRIS links an online interface to a historical database with more than 2,500 statistical tables. These tables, drawn from the results of the annual Survey of Industrial Research and Development (SIRD), contain all the industrial R&D data published by NSF from 1953-2007. IRIS data prior to 1999 are based on Standard Industrial Classification (SIC) and pre-SIC codes. Beginning with the 1999 survey, estimates are based on the North American Industrial Classification System (NAICS).

  • Campaigns in a New Media Age: How Candidates Use the World Wide Web to Win Elections
    "Martin Kifer, Michael Parkin, and Druckman are studying the impact of the Internet on electoral politics. Specifically, they have developed a theoretical framework for studying politicians' campaigns on the Web that accounts for political strategic aspects of Web-based campaigns and novel technical elements. They use the framework to guide a content analysis of over 1000 candidates Web sites over five election cycles. They complement these data with information on candidate and district characteristics to study a number of dynamics including how candidates' campaign on the Web, how Web campaign strategies differ from other types of media campaigning, why candidates Web sites differ from one another, how campaign Web sites have changed over time, and what effect Web campaigns might have in the future. They also have explored the websites of Members of Congress, and studied the effects of websites on voters' opinions (using experiments). At present data from three election studies is presently available. More data will be available in the near future." (Northwestern University)

  • CDP's Global Dataset
    Survey on the impacts of climate change and depletion of natural resources, globally. Surveys on emissions and environmental concerns from both city governments and investors. Datasets on emissions, forests, supply chains, and water. Questionnaires can be found online.

  • Comprehensive Epidemiologic Data Resource (CEDR)
    Department of Energy's database of health studies of DOE contract workers and environmental studies of areas surrounding DOE facilities.

  • DataRefuge
    Helps build a refuge for federal data and supports climate and environmental research and advocacy.

  • Excel file for RICE model as of April 26, 2010
    Examines alternative outcomes for emissions, climate change, and damages under different policy scenarios. It uses an updated version of the RICE model (Regional Integrated model of Climate and the Economy). New projections suggest that substantial future warming will occur if no abatement policies are implemented. The model also calculates the path of carbon prices necessary to keep the increase in global mean temperature to 2 degrees C or less in an efficient manner.

  • Global Digital Activism Data Set, February 2013
    Features coded cases of online digital activism from 151 countries and dependent territories. Several features from each case of digital activism were documented, including the year and month that online action commenced, the estimated age and country of origin of the initiator(s), the geographic scope of their campaign, and whether the action was online only, or also featured offline activities. Researchers were interested in the number and types of software applications that were used by digital activists. Specifically, information was collected on whether software applications were used to circumvent censorship or evade government surveillance, to transfer money or resources, to aid in co-creation by a collaborative group, or for purposes of networking, mobilization, information sharing, or technical violence (destructive/disruptive hacking). The collection illustrates the overall focus of each case of digital activism by defining the cause advanced or defended by the action, the initiator's diagnosis of the problem and its perceived origin, the identification of the targeted audience that the campaign sought to mobilize, as well as the target whose actions the initiators aimed to influence. Finally, each case of digital activism was evaluated in terms of its success or failure in achieving the initiator's objectives, and whether any other positive outcomes were apparent.

  • Great Plains Population and Environment Data (1870-2000)
    Project to collect information about approximately 500 counties in 12 states of the Great Plains of the United States, and then to analyze those data in order to understand the relationships between population and environment that existed between 1870 and 2000. The data distributed here are all data about counties. They fall into 4 broad categories: about the counties, about agriculture, about demographic and social conditions, and about the environment. The information about counties (name, area, identification code, and whether the county was classified as part of the Great Plains in a given year) is embedded in each of the other data files, so that there are 3 series of data (agriculture, demographic and social conditions, and environment), with individual data files for each year for which data are available.

  • Imagining the Internet (Elon University) (2004+)
    Mission is to explore and provide insights into emerging network innovations, global development, dynamics, diffusion and governance.

  • Information and Communication Technology Survey (ICT)
    Provides data on both noncapitalized and capitalized spending for information and communication technology equipment and computer software by U.S. nonfarm businesses with employees. Data have been collected annually beginning with data for 2003.

  • International Investment and R&D Data Link (National Science Foundation)
    Presents data from a project that links business and industrial R&D survey data from National Center for Science & Engineering Statistics (NCSES) to international investment survey data from the Bureau of Economic Analysis. The combined data sets provide information on the R&D activities of U.S. multinational companies (MNCs) and foreign MNCs with U.S. activities.

  • Local Economic Impacts of Coal Mining in the United States 1870 to 1970
    Expands upon the current "resource curse" literature by using newly collected county data, spanning over a century, to capture the short- and long-run effects of coal mining activity.

  • Local Telephone Competition and Broadband Deployment (Federal Communications Commission Reports) (2000-2008 Zip Code; 2008-2013 Census Tract)
    Form 477 - Pre-2009 data are at the zip code level. Variable reported: number of Internet Service Providers in the zip code that report at least one subscriber in the zip code for Internet service of at least 200 kbps. This definition of high-speed Internet may now be outdated. Note that if there are 1, 2 or 3 providers in a zip code, this is reported as one variable. Note that after 2008 the FCC changed the geography to Census Tracts and reported different variables. For 2014+, also see Broadband Deployment Data from FCC Form 477: all facilities-based broadband providers are required to file data with the FCC twice a year (Form 477) on where they offer Internet access service at speeds exceeding 200 kbps in at least one direction. Fixed providers file lists of census blocks in which they can or do offer service to at least one location, with additional information about the service. Some historic data go back to 2010. For other broadband data and some non-broadband reports/data back to 1991, also see the Internet Archive. Note that not all of the links, including those to zipped files, work.

  • Longitudinal Study of American Youth: Writing the history and monitoring the future of Generation X (LSAY)
    Designed to examine the development of: (1) student attitudes toward and achievement in science, (2) student attitudes toward and achievement in mathematics, and (3) student interest in and plans for a career in science, mathematics, or engineering, during middle school, high school, and the first 4 years post-high school, and to estimate the relative influence of parents, home, teachers, school, peers, media, and selected informal learning experiences on these developmental patterns. The older LSAY cohort, Cohort One, consisted of a national sample of 2,829 tenth-grade students in public high schools throughout the United States. These students were followed for an initial period of 7 years, ending 4 years after high school in 1994. The younger cohort, Cohort Two, consisted of a national sample of 3,116 seventh-grade students in public schools that served as feeder schools to the same high schools in which the older cohort was enrolled. These students were followed for an initial period of 7 years, concluding with a telephone interview approximately one year after the end of high school in 1994. Beginning in the fall of 1987, the LSAY collected a wide array of information from each student, including: (1) a science achievement test and a mathematics achievement test each fall, (2) an attitudinal and experience questionnaire at the beginning and end of each school year, (3) reports about education and experience from all science and math teachers in each school, (4) reports on classroom practice by each science and math teacher serving an LSAY student, (5) an annual 25-minute telephone interview with one parent of each student, and (6) extensive school-level information from the principal of each study school. In 2006, the NSF funded a proposal to re-contact the original LSAY students (now in their mid-30s) and resume data collection to determine their educational and occupational outcomes. Through an extensive tracking activity, more than 95% of the original sample of 5,945 LSAY students were located or accounted for. A new eligible sample of approximately 5,000 students was defined and these young adults were asked to complete a survey in 2007. For more information, also see the LSAY website.

  • National Agricultural Workers Survey (NAWS) 1989+
    Employment-based, random survey of the demographic, employment, and health characteristics of the U.S. crop labor force. Information is obtained directly from farm workers through face-to-face interviews. Since 1988, when the survey began, nearly 50,000 workers have been interviewed. Samples crop workers in 3 cycles each year to reflect the seasonality of agricultural production and employment. Workers are located at their farm job sites. During the initial contact, arrangements are made to interview the respondent at home or at another location convenient to the respondent.

    Sample Size: 1,500 to 4,000 workers are interviewed each year.

  • National Consumer Broadband Service Capability Survey
    The Broadband Data Improvement Act directed the Federal Communications Commission to conduct and make public periodic surveys of consumers as part of the FCC's efforts to understand who uses broadband, who does not, and, if not, why people do not subscribe.

  • National Sample Survey of Nurse Practitioners (NSSNP) (2012)
    Addresses data gaps in education, training, employment, and practice patterns for this population. Approximately 13,000 nurse practitioners completed the 2012 survey.

  • National Sample Survey of Registered Nurses (1977-2008)
    Conducted approximately every 4 years since 1977. The data from these periodic surveys provide the basis for evaluating trends and projecting the future supply of nursing resources.

  • National Science Foundation Surveys of Public Attitudes Toward and Understanding of Science and Technology, 1979-2006
    Monitored the general public's attitudes toward and interest in science and technology. The survey assessed levels of literacy and understanding of scientific and environmental concepts and constructs such as DNA, probability, and experimental methods, how scientific knowledge and information were acquired, attentiveness to public policy issues, and computer access and usage. Since 1979, the survey was administered at regular intervals (occurring every 2 or 3 years), producing 12 cross-sectional surveys through 2006. Respondents were asked how they received information concerning science or news (e.g., via newspapers, magazines, or television), what types of television programming they watched, and what kinds of magazines they read. They were also asked if they agreed with statements concerning science and technology and how they affect everyday living. Respondents were further asked a series of true and false questions regarding science-based statements (e.g., the center of the Earth is hot, all radioactivity is manmade, electrons are smaller than atoms, etc.). Additional topics included whether the respondent had a postsecondary degree, field of highest degree, number of science-based college courses taken, major in college, household ownership of a computer, access to the World Wide Web, number of hours spent on a computer at home or at work, and topics searched for via the Internet. Demographic variables include gender, race, age, marital status, number of people in household, level of education, and occupation.

  • National Surveys on Energy and Environment [United States] (NSEE) (2008+)
    Includes twice-yearly national opinion surveys on issues directly related to climate change and energy policy, as well as other surveys conducted on a range of topics such as hydraulic fracturing ("fracking"), the Great Lakes, and wider issues of energy and environment. From 2008-2012 the survey was called the "National Survey of American Public Opinion on Climate Change" (NSAPOCC); starting in 2013 the survey was renamed the "National Surveys on Energy and Environment" (NSEE). Although the datasets are listed by survey wave, the NSEE is a valuable source of longitudinal public-opinion data on climate change and energy policy. Many questions have been asked over multiple waves, including questions about belief in global warming that have been asked in every wave of the NSEE. Also check Open ICPSR for additional rounds.

  • New York City Community Air Survey (NYCCAS)
    Evaluates how air quality differs across New York City. As part of the City's sustainability initiative, PlaNYC, this program studies how pollutants from traffic, buildings (boilers and furnaces), and other sources impact air quality in different neighborhoods. Monitors pollutants that cause health problems, such as fine particles, nitrogen oxides, elemental carbon (a marker for diesel exhaust particles), sulfur dioxide and ozone. Although New York City air quality is improving, the Health Department estimates that fine particle pollution alone caused an average of more than 2,000 deaths, approximately 1,500 hospital admissions for lung and heart conditions, and 5,000 emergency department admissions for asthma, based on levels in 2009-11. NYCCAS air pollution measurements are taken at about 100 locations throughout New York City in each season. Monitors are mounted 10 to 12 feet off the ground on public light poles or utility poles along streets and in some parks. The monitors use a small battery-powered pump and filters to collect air samples.

  • Open Data Flint
    Open access repository for data and data-related resources about the Flint, Michigan community. Aims to: (1) bring together data to help build the evidence base to achieve a healthier Flint community and (2) gain a deeper understanding of the far-reaching impact of the water crisis on the Flint population.

  • Pew Internet & American Life Project
    Produces reports that explore the impact of the Internet on families, communities, work and home, daily life, education, health care, and civic and political life. Raw data are released 6 months after a survey report has been published. Useful for topics such as gaming, social networking, dating, and online shopping. Some of their surveys are also contained in IPOLL.

  • Population Exposure Estimates in Proximity to Nuclear Power Plants, Locations
    Provides a global data set of point locations and attributes describing nuclear power plants and reactors.

  • Public Understanding of Science and Technology, NSF 1979-2006
    Used to monitor public attitudes toward a variety of science-related issues and topics since 1979. Has also been used to gauge how much the public knows about science and the scientific process, how interested people are in science, and where they get information about science.

  • Researching Environmental Economics at Princeton University
    Guide to environmental economics.

  • Roper Center - Public Opinion on Space Exploration
    Selection of surveys that look at how the American people feel about the money spent on space exploration; the practicality of a space based defense; the value of learning if there were ever living creatures on Mars; and if they believe in UFOs and beings from other planets.

  • Science and Engineering State Profiles (National Science Foundation)
    Contains data from 2003-12 and allows users to generate science and engineering profiles that summarize state-specific data on personnel and finances. The State Profiles data tool can display a single state's profile or a profile containing up to 10 states. Also see the Science and Engineering Indicators Data Tool.

  • Scientists and Engineers Statistical Data System
    In addition to SESTAT, a comprehensive and integrated system of information about the employment, educational, and demographic characteristics of scientists and engineers, this site also makes available several surveys of recipients of higher education. They include:
    • National Survey of Recent College Graduates (2001, 2003, 2006, 2008, 2010; discontinued)
    • Survey of Doctorate Recipients (2001, 2003, 2006, 2008, 2010, 2013)
    • National Survey of College Graduates (1993, 2003, 2010, 2013, 2015)
    • International Survey of Doctorate Recipients (2010, 2013)

  • Survey of Graduate Students and Postdoctorates in Science and Engineering
    Provides data on the number and characteristics of students in graduate science and engineering and health-related fields enrolled in U.S. institutions. Assesses trends in financial support patterns and shifts in graduate enrollment and postdoctoral appointments.

  • Terra Populus: Integrated Data on Population and Environment
    Integrates the world's population and environmental data, including population censuses and surveys; land cover information from remote sensing; climate records from weather stations; and land use records from statistical agencies. Currently includes over 80 countries.

  • Townsend Thai Project
    Includes both annual and monthly panels, in addition to the collection of environmental data. Originally the Townsend Thai survey focused on villages in 4 provinces, 2 in the Northeast and 2 in the Central region. The baseline survey was conducted in 1997 and is referred to as the Big Survey. It includes a household, institutional and key informant (village leader) module. In 1998, the research team launched a monthly household survey. To date, the Townsend Thai project continues to resurvey the annual and monthly panels. In 2003, an annual survey of villages in the South was added, and in 2004, 2 provinces in the North were included in the annual survey. In 2006, the annual surveys were extended to include urban areas in the same 4 provinces.

  • Trends in International Mathematics and Science Study (TIMSS) (1995+)
    Provides reliable and timely data on the mathematics and science achievement of U.S. 4th- and 8th-grade students compared to that of students in other countries. Collected in 1995, 1999, 2003, 2007, and 2011. Next round of collection will be in 2015. Also see the NCES Bibliography for literature that has used this data.

  • United States Agriculture Data, 1840 - 2010
    Includes county-level data from the United States Censuses of Agriculture for the years 1840 to 2010. Provides data about the number, types, output, and prices of various agricultural products, as well as information on the amount, expenses, sales, values, and production of machinery. Most of the basic crop output data apply to the previous harvest year. Data collected also included the population and value of livestock, the number of animals slaughtered, and the size, type, and value of farms. Part 46 of this collection contains data from 1980 through 2010. Variables in part 46 include information such as the average value of farmland, number and value of buildings per acre, food services, resident population, composition of households, and unemployment rates.

  • University of Texas at Austin Energy Poll (2011-2013)
    Public opinion poll that measures and reports biannually (October and April) on consumer opinions and attitudes toward energy consumption, pricing, development and regulation.

  • World Telecommunication/ICT Indicators Database (2015 ed.)
    Contains time series data for 1960, 1965, 1970 and annually from 1975-2014 for around 180 different telecommunication and ICT statistics covering the telecommunication network and ICT uptake, mobile services, quality of service, traffic, staff, tariffs, revenue and investment. Data for over 200 economies are available. For select series more recent data may be available on the ITU website. Also see ITU Historical Statistics with select data from 1849-1967.

This page last updated: October 21, 2009