Which Of The Following Are Not Research Data

Research data is essential for scientific inquiry, analysis, and knowledge development. However, not everything collected or encountered during research qualifies as research data. Understanding what does not constitute research data is crucial for proper data management, ethical compliance, and accurate interpretation of research findings.

Introduction

In academic and scientific research, data serves as the foundation for drawing conclusions and advancing knowledge. Yet, many materials and information types exist that, despite being part of the research process, do not qualify as research data. These exclusions are important to recognize because they affect how information is stored, shared, and cited in scholarly work.

What Constitutes Research Data

Before identifying what is not research data, it helps to clarify what research data actually is. Research data typically includes:

Quantitative measurements and observations
Qualitative responses from participants
Experimental results and measurements
Survey responses and interview transcripts
Field notes and observational records
Statistical outputs and analysis results
Digital files generated during research (images, audio, video)

These materials are systematically collected, analyzed, and used to support or refute research hypotheses.

Materials That Are Not Research Data

Published Literature and References

Published books, journal articles, and other scholarly works that researchers cite in their studies are not considered research data. These materials serve as background information, theoretical frameworks, or comparative sources. While essential for contextualizing research, they are secondary sources rather than primary data collected during the study.

Research Proposals and Grant Applications

Documents that outline research plans, methodologies, and funding requests are administrative materials. Though they guide the research process, they do not constitute data themselves. They describe what data will be collected rather than containing the actual data.

Laboratory Notebooks and Research Journals

Personal research notes, laboratory notebooks, and research journals often contain preliminary thoughts, procedural notes, and planning details. Unless they contain actual collected data, such as recorded measurements or observations, they remain working documents rather than research data.

Software and Analytical Tools

The programs, algorithms, and software used to process or analyze data are tools rather than data themselves. While the code might be shared as part of open science initiatives, the software itself is a methodological instrument, not research data.

Metadata and Documentation

Information describing data collection methods, variable definitions, and data structure (metadata) facilitates data understanding but is not data itself. These are supporting documents that provide context for the actual research data.

Personal Communications and Informal Discussions

Conversations with colleagues, emails discussing research ideas, and informal brainstorming sessions are not research data. These communications may inspire research directions but do not constitute collected data unless formally recorded as part of the study.

Equipment and Physical Samples

The instruments used for measurement and the physical samples examined during research are objects, not data. While they generate data through interaction, the equipment and samples themselves are research materials rather than data.

Published Datasets and Secondary Data

When researchers use existing datasets collected by others, those original datasets are not their research data, though any new analysis or subset they create would be. Similarly, published statistics and aggregated data serve as sources rather than primary research data.

Ethical Approvals and Consent Forms

Documents that demonstrate research ethics compliance, including institutional review board approvals and participant consent forms, are regulatory records. They ensure that ethical standards are met but do not contain research data.

Drafts and Revision Histories

Multiple versions of research manuscripts, including rejected drafts and revision histories, document the writing process. These materials show research development but are not data themselves.

Personally Identifiable Information Not Collected for Research

Information about individuals that researchers possess but did not collect as part of their formal study is not research data. For example, a researcher knowing a participant's occupation from casual conversation, without recording it for the study, does not make that information research data.

Why the Distinction Matters

Understanding what constitutes research data has practical implications:

Data Management Planning: Institutions and funders require researchers to plan for data storage, sharing, and preservation. Knowing what qualifies as data ensures appropriate resource allocation.

Ethical Compliance: Research ethics committees review proposals based on data collection methods. Clear distinctions help in obtaining proper approvals.

Data Sharing and Open Science: Many journals and funders now require data sharing. Accurate identification of research data facilitates compliance with these requirements.

Reproducibility and Verification: Scientific integrity depends on others being able to verify findings. Properly distinguishing data from other research materials supports reproducibility.

Storage and Preservation: Data repositories have specific requirements and limitations. Understanding what qualifies as data ensures appropriate preservation strategies.

Summary

Research data forms the empirical foundation of scientific inquiry, but not all information encountered during research qualifies as data. Published literature, research proposals, laboratory notebooks, software, metadata, personal communications, equipment, secondary data, ethical documentation, drafts, and certain personal information are among the materials that, while important to the research process, are not research data themselves. Recognizing these distinctions supports proper data management, ethical compliance, and the advancement of transparent, reproducible research practices. As research methodologies continue to evolve, maintaining clear understanding of what constitutes research data remains essential for researchers, institutions, and the broader scientific community.

Best Practices for Identifying Research Data

To reliably separate data from ancillary materials, researchers can adopt a systematic checklist at the outset of a project. First, define the research question and specify the variables that will be measured or observed; these variables become the core data set. Second, document the provenance of each information item: whether it was generated by an instrument, derived from a survey, extracted from a database, or obtained through direct observation. Third, maintain a data dictionary that lists variable names, units of measurement, collection timestamps, and any transformations applied. Fourth, store raw data as read-only files (e.g., CSV, JSON, or proprietary instrument files) and keep processed versions in separate, version-controlled directories. Finally, link each data file to its associated metadata record in a repository or lab notebook, so that the metadata describes the context without becoming part of the data itself.
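The third and fourth checklist items can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed workflow: the `VariableEntry` fields, the file names, and the helper functions `write_data_dictionary` and `make_read_only` are hypothetical names chosen for this example, and "read-only" here means clearing the file's write-permission bits with the standard library.

```python
import csv
import os
import stat
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass(frozen=True)
class VariableEntry:
    """One row of a data dictionary (field names are illustrative assumptions)."""
    name: str
    unit: str
    description: str
    transformation: str = "none"


def write_data_dictionary(entries, path):
    """Write the data dictionary as a CSV file stored alongside the data."""
    fieldnames = ["name", "unit", "description", "transformation"]
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        for entry in entries:
            writer.writerow(asdict(entry))


def make_read_only(path):
    """Clear every write-permission bit so the raw file is not edited by accident."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical raw file standing in for an instrument export.
        raw_file = Path(tmp) / "raw_readings.csv"
        raw_file.write_text("timestamp,temp_c\n2024-01-01T00:00:00,21.5\n")
        make_read_only(raw_file)

        write_data_dictionary(
            [VariableEntry("temp_c", "degC", "Air temperature at the sensor")],
            Path(tmp) / "data_dictionary.csv",
        )
        print(oct(stat.S_IMODE(os.stat(raw_file).st_mode)))

        # Restore write permission so the temporary directory can be removed
        # cleanly on every platform.
        os.chmod(raw_file, stat.S_IREAD | stat.S_IWRITE)
```

Keeping the data dictionary in its own file, rather than as header comments inside the raw file, mirrors the separation the checklist recommends: the dictionary describes the data without becoming part of it.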

Tools and Technologies

Modern research environments offer a variety of tools that help enforce the boundary between data and non-data. Electronic lab notebooks (ELNs) often include templated forms for experiment protocols, allowing scientists to record observations in structured fields that can be exported to data repositories. Workflow management systems such as Nextflow, Snakemake, or Galaxy capture the provenance of each computational step, making it easy to distinguish input data, intermediate files, and final results. Version-control platforms like Git, when combined with data-specific extensions (e.g., Git-LFS or DataLad), enable researchers to track changes to data sets while keeping code and documentation in separate directories or repositories. Cloud-based storage solutions provide bucket-level policies that can restrict write access to raw data folders, reducing the risk of accidental modification. Leveraging these technologies not only streamlines identification but also creates an auditable trail that helps satisfy funder and journal requirements.
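Where none of these platforms is available, a checksum manifest is one lightweight way to detect accidental modification of raw data. The sketch below is a stand-in technique using only Python's standard library; it does not reproduce how Git-LFS, DataLad, or any workflow system works internally, and the function names `sha256_of`, `build_manifest`, and `changed_files` are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(raw_dir):
    """Map each file under raw_dir (by relative POSIX path) to its checksum."""
    raw_dir = Path(raw_dir)
    return {
        path.relative_to(raw_dir).as_posix(): sha256_of(path)
        for path in sorted(raw_dir.rglob("*"))
        if path.is_file()
    }


def changed_files(raw_dir, manifest):
    """List files that were added, removed, or altered since the manifest was built."""
    current = build_manifest(raw_dir)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )
```

A typical use would be to call `build_manifest` once when the raw data are first stored, keep the result with the project's documentation, and run `changed_files` before each analysis; an empty list indicates that the raw files still match what was originally recorded.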

Training and Education

Embedding data-identification competencies into graduate curricula and ongoing professional development reduces ambiguity at the source. Workshops that walk participants through real-world examples help cement conceptual distinctions, for instance by differentiating a laboratory notebook entry that records a temperature reading from the notebook's narrative description of the experimental setup. Role-playing exercises, in which trainees draft data management plans and then peer-review each other's plans for clarity about what constitutes data versus metadata or procedural text, reinforce best practices. Institutions can further support learning by providing quick-reference guides, FAQs, and decision trees that researchers can consult when unsure whether a particular item should be deposited in a data repository.
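A decision tree of the kind mentioned above can be written down as a short function. The sketch below is only an illustration of the categories used in this article: the `Item` fields, the order of the questions, and the returned labels are assumptions made for this example, and a real institutional guide would likely ask more questions.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """A research artifact described by yes/no answers (all fields hypothetical)."""
    name: str
    collected_for_study: bool            # gathered or generated to answer the research question
    records_observations: bool           # holds measurements, responses, or observations
    describes_other_data: bool = False   # e.g. variable definitions, collection methods
    is_tool_or_instrument: bool = False  # software, equipment, physical samples


def classify(item):
    """Walk the questions in order and return a coarse category label."""
    if item.is_tool_or_instrument:
        return "tool or material"
    if item.describes_other_data:
        return "metadata"
    if item.collected_for_study and item.records_observations:
        return "research data"
    return "supporting document"


if __name__ == "__main__":
    examples = [
        Item("temperature reading in notebook", True, True),
        Item("narrative description of setup", True, False),
        Item("analysis script", False, False, is_tool_or_instrument=True),
        Item("variable definitions", False, False, describes_other_data=True),
    ]
    for example in examples:
        print(f"{example.name}: {classify(example)}")
```

The first two examples correspond to the notebook distinction described above: the recorded temperature is classified as research data, while the narrative description of the setup falls through to supporting documentation.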

Case Studies

Clinical Trial Dataset – In a multi‑site Phase III trial, the primary outcome measures (blood pressure readings, adverse event codes) were designated as research data. The trial protocol, statistical analysis plan, and investigator meeting minutes were classified as non‑data documentation. By maintaining this separation, the team could deposit the raw outcome data in a public archive while retaining the protocol documents for internal review, satisfying both transparency requirements and regulatory confidentiality obligations.

Environmental Sensor Network – A university ecology project deployed autonomous sensors that logged temperature, humidity, and soil moisture every fifteen minutes. The sensor logs constituted the research data. The firmware version, deployment maps, and maintenance checklists were treated as ancillary material. When a sensor malfunctioned, the team could retrieve the exact log files for the affected period without needing to sift through unrelated configuration files, facilitating rapid troubleshooting and reproducibility of the published time‑series analysis.

Humanities Text Mining – Scholars compiling a corpus of digitized nineteenth‑century newspapers for sentiment analysis identified the cleaned, tokenized text files as research data. The original scanned images, OCR error reports, and scholarly commentary on source selection were recorded separately. This clear delineation allowed the group to share the tokenized corpus via a linguistic data repository while preserving the scanned images in a dark archive for future re‑OCR efforts, ensuring both accessibility and long‑term preservation.

Conclusion

Recognizing what qualifies as research data—and what does not—is a foundational skill that underpins effective data management, ethical compliance, and scientific reproducibility. By applying clear identification criteria, leveraging appropriate tools, investing in targeted training, and learning from concrete case studies, researchers can confidently curate their data sets while preserving the integrity of associated methodological and administrative records. As research practices grow increasingly interdisciplinary and data‑intensive, maintaining this distinction will continue to support transparent, trustworthy, and reusable science for the benefit of the entire scholarly community.
