ISSN: 2766-2276
General Science Group 2024 October 11;5(10):1288-1290. doi: 10.37871/jbres2016.



open access journal Commentary

Hypotheses Generation through Team Science or Fishing Expedition

Gerhard A Coetzee*

Van Andel Institute, 333 Bostwick Ave NE, Grand Rapids, Michigan 49503, USA
*Corresponding author: Gerhard A Coetzee, Van Andel Institute, 333 Bostwick Ave NE, Grand Rapids, Michigan 49503, USA E-mail:

Received: 08 October 2024 | Accepted: 11 October 2024 | Published: 11 October 2024
How to cite this article: Coetzee GA. Hypotheses Generation through Team Science or Fishing Expedition. J Biomed Res Environ Sci. 2024 Oct 11; 5(10): 1288-1290. doi: 10.37871/jbres2016, Article ID: jbres1757
Copyright:© 2024 Coetzee GA. Distributed under Creative Commons CC-BY 4.0.
Keywords
  • Hypothesis-generation
  • Fishing expedition
  • Team science

Modern biological science depends on the generation of hypotheses, which is best achieved without preconceptions and through a combination of collective teamwork and individual brilliance. ‘Open’ searches can therefore generate informative hypotheses that can in turn be framed, explored and tested, ultimately leading to novel mechanistic insights and their translation into informative diagnostics and therapeutics.

NIH study sections, like review panels at other funding agencies, often dismiss screening studies (also termed hypothesis-generating studies) with the pejorative term ‘fishing expedition’. Such endeavors are considered high-risk and inherently non-mechanistic. However, if successful, they hold the potential for significant rewards because they lead to novel, innovative, and agnostic discoveries, in many cases yielding surprising results. By contrast, more traditional mechanistic studies, rooted in well-founded hypotheses, are often predictable and at best incremental. While they are typically deemed safe bets for funding because of their perceived continuity with the canon, they may inherently lack the transformative potential to shift paradigms. I accordingly contend that the tendency to dismiss hypothesis-generating screening studies as uncertain endeavors with unclear outcomes is a fallacy. A shift in perception is warranted and should be advocated.

Examples of genomic hypotheses-generating studies include:

Genome-Wide Association Studies (GWAS) [1]. The Human Genome Project [2], coupled with genetic variations such as Single Nucleotide Polymorphisms (SNPs), has provided a wealth of data informing testable hypotheses to uncover genes involved in heritability [3].

Although GWAS signals predominantly occur in non-coding DNA regions, ongoing research is elucidating the complex underpinnings of gene regulation at these regions as manifested in many complex diseases, such as Parkinson’s Disease [4].

The ENCODE project [5]. Again, when it became clear that by far the most genetically variable sites in the genome reside in DNA regions not coding for proteins, ENCODE data revealed that such sequences are in many cases chromatin structure-mediated enhancers controlling gene expression near or far on linear DNA; what previously was considered ‘junk DNA’ turned out to be functionally very important.

‘Decoding the Brain’ [6] is a recent summary of the work of the PsychENCODE consortium, which provides valuable annotations and atlases for the regulation of brain gene expression in health and disease.

The Cancer Genome Atlas [7]. From the abstract: “The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein, and epigenetic levels. The resulting rich data provide a major opportunity to develop an integrated picture of commonalities, differences, and emergent themes across tumor lineages”.

Foundational Data Initiative for Parkinson’s Disease (FOUNDIN-PD) [8]. By establishing 95 induced pluripotent stem cell lines with different genetic risk backgrounds, this consortium aims to screen for gene regulation/expression and imaging to establish mechanistic hypotheses for the genetic etiology of PD.

Many large data sets exist that can relatively easily be interrogated to generate testable hypotheses [9].

The studies referred to above have been highly successful in generating hypotheses that, when tested, yielded novel biological, functional, and mechanistic insight. Furthermore, because they are unbiased, they often yield surprising findings. We should drop the derogatory term ‘fishing expedition’.

As pointed out above, many hypotheses-generating studies also entail large consortia of scientific teams. The question therefore arises as to how best to perform hypotheses-generating work. The American biological research enterprise has traditionally relied on relatively small groups (5-10 persons) established around a single Principal Investigator (PI). Even today, such independent research is considered desirable for recruiting and promotion in American research institutions. Modern biological science, however, relies heavily on large teams because of the complexity of working with big data sets. For example, sophisticated statistical and bioinformatic analyses rely on ‘deep dives’ into modern technologies, analytical tools, and computation. Furthermore, AI-derived data analyses are becoming increasingly sophisticated and need experts in information technology to separate junk or biases from useful information [10]. Both individual innovative abilities and collaborations need to be fostered for the best results. In many cases, proximity in labs and offices may lead to fortuitous cross-fertilization of ideas, which is lacking when working remotely. Unexpected results can become the most fruitful.

Finally, it should be stressed that obtaining the big screening data referred to above should not be the end point. It should be followed by in-depth work on the hypotheses that were generated, in order to obtain mechanistic insights.

  1. Kondratyev NV, Alfimova MV, Golov AK, Golimbet VE. Bench Research Informed by GWAS Results. Cells. 2021 Nov 15;10(11):3184. doi: 10.3390/cells10113184. PMID: 34831407; PMCID: PMC8623533.
  2. Dermitzakis ET. Genome-sequencing anniversary. Genome literacy. Science. 2011 Feb 11;331(6018):689-90. doi: 10.1126/science.1203237. PMID: 21310993.
  3. Via M, Gignoux C, Burchard EG. The 1000 Genomes Project: new opportunities for research and social challenges. Genome Med. 2010 Jan 21;2(1):3. doi: 10.1186/gm124. PMID: 20193048; PMCID: PMC2829928.
  4. Pierce SE, Booms A, Prahl J, van der Schans EJC, Tyson T, Coetzee GA. Post-GWAS knowledge gap: the how, where, and when. NPJ Parkinsons Dis. 2020 Sep 9;6:23. doi: 10.1038/s41531-020-00125-y. PMID: 32964108; PMCID: PMC7481221.
  5. Ecker JR, Bickmore WA, Barroso I, Pritchard JK, Gilad Y, Segal E. Genomics: ENCODE explained. Nature. 2012 Sep 6;489(7414):52-5. doi: 10.1038/489052a. PMID: 22955614.
  6. Nusinovich Y. Decoding the Brain. Science. 2024;384(6698):858-859.
  7. Cancer Genome Atlas Research Network; Weinstein JN, Collisson EA, Mills GB, Shaw KR, Ozenberger BA, Ellrott K, Shmulevich I, Sander C, Stuart JM. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet. 2013 Oct;45(10):1113-20. doi: 10.1038/ng.2764. PMID: 24071849; PMCID: PMC3919969.
  8. Bressan E, Reed X, Bansal V, Hutchins E, Cobb MM, Webb MG, Alsop E, Grenn FP, Illarionova A, Savytska N, Violich I, Broeer S, Fernandes N, Sivakumar R, Beilina A, Billingsley KJ, Berghausen J, Pantazis CB, Pitz V, Patel D, Daida K, Meechoovet B, Reiman R, Courtright-Lim A, Logemann A, Antone J, Barch M, Kitchen R, Li Y, Dalgard CL; American Genome Center; Rizzu P, Hernandez DG, Hjelm BE, Nalls M, Gibbs JR, Finkbeiner S, Cookson MR, Van Keuren-Jensen K, Craig DW, Singleton AB, Heutink P, Blauwendraat C. The Foundational Data Initiative for Parkinson Disease: Enabling efficient translation from genetic maps to mechanism. Cell Genom. 2023 Feb 6;3(3):100261. doi: 10.1016/j.xgen.2023.100261. PMID: 36950378; PMCID: PMC10025424.
  9. Rigden DJ, Fernández XM. The 2024 Nucleic Acids Research database issue and the online molecular biology database collection. Nucleic Acids Res. 2024 Jan 5;52(D1):D1-D9. doi: 10.1093/nar/gkad1173. PMID: 38035367; PMCID: PMC10767945.
  10. Messeri L, Crockett MJ. Artificial intelligence and illusions of understanding in scientific research. Nature. 2024 Mar;627(8002):49-58. doi: 10.1038/s41586-024-07146-0. Epub 2024 Mar 6. PMID: 38448693.


This work is licensed under a Creative Commons Attribution 4.0 International License.

